Abstract

Consider the minimum spanning tree (MST) of the complete graph with $n$ vertices,
when edges are assigned independent random weights. Endow this tree with the graph
distance renormalized by $n^{1/3}$ and with the uniform measure on its vertices. We show
that the resulting space converges in distribution as $n \to \infty$ to a random measured
metric space in the Gromov-Hausdorff-Prokhorov topology. We additionally show that
the limit is a random binary $\mathbb{R}$-tree and has Minkowski dimension 3 almost surely. In
particular, its law is mutually singular with that of the Brownian continuum random
tree or any rescaled version thereof. Our approach relies on a coupling between the
MST problem and the Erdős-Rényi random graph. We exploit the explicit description
of the scaling limit of the Erdős-Rényi random graph in the so-called critical window,
established in [4], and provide a similar description of the scaling limit for a "critical




Figure 1: A simulation of the minimum spanning tree on $K_{3000}$. Black edges have weights
less than 1/3000; for coloured edges, weights increase as colours vary from red to purple.
1 Introduction

1.1 A brief history of minimum spanning trees 

The minimum spanning tree (MST) problem is one of the first and foundational problems in 
the field of combinatorial optimisation. In its initial formulation by Borůvka [21], one is given
distinct, positive edge weights (or lengths) for $K_n$, the complete graph on vertices labelled
by the elements of $\{1, \ldots, n\}$. Writing $\{w_e, e \in E(K_n)\}$ for this collection of edge weights,
one then seeks the unique connected subgraph $T$ of $K_n$ with vertex set $V(T) = \{1, \ldots, n\}$
that minimizes the total length

$$\sum_{e \in E(T)} w_e.$$

Algorithmically, the MST problem is among the easiest in combinatorial optimisation:
procedures for building the MST are both easily described and provably efficient. The most
widely known MST algorithms are commonly called Kruskal's algorithm and Prim's
algorithm.$^1$ Both procedures are important in this work; as their descriptions are short, we
provide them immediately.

Kruskal's algorithm: start from a forest of n isolated vertices {1, . . . , n}. At each step, 
add the unique edge of smallest weight joining two distinct components of the current forest. 
Stop when all vertices are connected. 

Prim's algorithm: fix a starting vertex $i$. At each step, consider all edges joining the
component currently containing $i$ with its complement, and from among these add the unique
edge of smallest weight. Stop when all vertices are connected.
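The two procedures just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: vertices are labelled $0, \ldots, n-1$ rather than $1, \ldots, n$, the weights are stored in a dictionary keyed by pairs $(i, j)$ with $i < j$, and the function names `kruskal`, `prim` and `random_weights` are hypothetical helpers that do not appear in the text.

```python
import heapq
import itertools
import random


def kruskal(n, weights):
    """Kruskal's algorithm on the complete graph with vertices 0..n-1.

    `weights` maps each pair (i, j) with i < j to a distinct positive weight.
    """
    parent = list(range(n))  # union-find forest over the vertices

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    tree = set()
    for (i, j) in sorted(weights, key=weights.get):
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two distinct components
            parent[ri] = rj
            tree.add((i, j))
            if len(tree) == n - 1:  # all vertices are connected
                break
    return tree


def prim(n, weights, start=0):
    """Prim's algorithm grown from the vertex `start`."""

    def w(i, j):
        return weights[(min(i, j), max(i, j))]

    in_tree = {start}
    heap = [(w(start, v), start, v) for v in range(n) if v != start]
    heapq.heapify(heap)
    tree = set()
    while len(in_tree) < n:
        _, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue  # this edge no longer leaves the current component
        in_tree.add(v)
        tree.add((min(u, v), max(u, v)))
        for x in range(n):
            if x not in in_tree:
                heapq.heappush(heap, (w(v, x), v, x))
    return tree


def random_weights(n, seed=0):
    """I.i.d. Uniform[0, 1] weights on the edges of the complete graph."""
    rng = random.Random(seed)
    return {e: rng.random() for e in itertools.combinations(range(n), 2)}
```

Since the edge weights are distinct, the MST is unique, so both functions should return the same edge set on the same input.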

Unfortunately, efficient procedures for constructing MST's do not automatically yield 
efficient methods for understanding the typical structure of the resulting objects. To address 
this, a common approach in combinatorial optimisation is to study a procedure by examining 
how it behaves when given random input; this is often called average case or probabilistic 
analysis. 

The probabilistic analysis of MST's dates back at least as far as Beardwood, Halton and
Hammersley [19], who studied the Euclidean MST of $n$ points in $\mathbb{R}^d$. Suppose that $\mu$ is an
absolutely continuous measure on $\mathbb{R}^d$ with bounded support, and let $(P_i, i \ge 1)$ be i.i.d.
samples from $\mu$. For an edge $e = \{i, j\}$, take $w_e$ to be the Euclidean distance between $P_i$ and
$P_j$. Then there exists a constant $c = c(\mu)$ such that if $X_n$ is the total length of the minimum
spanning tree, then

$$\frac{X_n}{n^{(d-1)/d}} \xrightarrow{\text{a.s.}} c.$$

This law of large numbers for Euclidean MST's is the jumping-off point for a massive amount 
of research: on more general laws of large numbers [69, 67, 75, 61], on central limit theorems 
[16, 41, 45, 46, 60, 77], and on the large-$n$ scaling of various other "localizable" functionals
of random Euclidean MST's [56, 57, 58, 68, 42]. (The above references are representative,
rather than exhaustive. The books of Penrose [59] and of Yukich [76] are comprehensive 
compendia of the known results and techniques for such problems.) 

From the perspective of Borůvka's original formulation, the most natural probabilistic
model for the MST problem may be the following. Weight the edges of the complete graph $K_n$
with independent and identically distributed (i.i.d.) random edge weights $\{W_e : e \in E(K_n)\}$
whose common distribution $\mu$ is atomless and has support contained in $[0, \infty)$, and let $\mathbb{M}^n$
be the resulting random MST. The conditions on $\mu$ ensure that all edge weights are positive
and distinct. Frieze [31] showed that if the common distribution function $F$ is differentiable
at $0^+$ and $F'(0^+) > 0$, then the total weight $X_n$ satisfies

$$F'(0^+) \cdot \mathbb{E}[X_n] \to \zeta(3), \qquad (2)$$

whenever the edge weights have finite mean. It is also known that $F'(0^+) \cdot X_n \xrightarrow{p} \zeta(3)$
without any moment assumptions for the edge weights [31, 66]. Results analogous to (2)
have been established for other graphs, including the hypercube [62], high-degree expanders
and graphs of large girth [20], and others [32, 30].

$^1$ Both of these names are misnomers or, at the very least, obscure aspects of the subject's development;
see Graham and Hell [34] or Schrijver [65] for careful historical accounts.

Returning to the complete graph $K_n$, Aldous [8] proved a distributional convergence
result corresponding to (2) in a very general setting where the edge weight distribution is
allowed to vary with $n$, extending earlier, related results [72, 18]. Janson [36] showed that
for i.i.d. Uniform[0, 1] edge weights on the complete graph, $n^{1/2}(X_n - \zeta(3))$ is asymptotically
normally distributed, and gave an expression for the variance that was later shown [39] to
equal $6\zeta(4) - 4\zeta(3)$.

If one is interested in the graph theoretic structure of the tree $\mathbb{M}^n$ rather than in
information about its edge weights, the choice of distribution $\mu$ is irrelevant. To see this, observe that
the behaviour of both Kruskal's algorithm and Prim's algorithm is fully determined once we
order the edges in increasing order of weight, and for any distribution $\mu$ as above, ordering
the edges by weight yields a uniformly random permutation of the edges. We are thus free to
choose whichever distribution $\mu$ is most convenient, or simply to choose a uniformly random
permutation of the edges. Taking $\mu$ to be uniform on [0, 1] yields a particularly fruitful
connection to the now-classical Erdős-Rényi random graph process. This connection has proved
fundamental to the detailed understanding of the global structure of $\mathbb{M}^n$ and is at the heart
of the present paper, so we now explain it.

Let the edge weights $\{W_e : e \in E(K_n)\}$ be i.i.d. Uniform[0, 1] random variables. The
Erdős-Rényi graph process $(\mathbb{G}(n,p), 0 \le p \le 1)$ is the increasing graph process obtained
by letting $\mathbb{G}(n,p)$ have vertices $\{1, \ldots, n\}$ and edges $\{e \in E(K_n) : W_e \le p\}$.$^2$ For fixed $p$,
each edge of $K_n$ is independently present with probability $p$. Observing the process as $p$
increases from zero to one, the edges of $K_n$ are added one at a time in exchangeable random
order. This provides a natural coupling with the behaviour of Kruskal's algorithm for the
same weights, in which edges are considered one at a time in exchangeable random order,
and added precisely if they join two distinct components. More precisely, for $0 < p < 1$ write
$\mathbb{M}(n,p)$ for the subgraph of the MST $\mathbb{M}^n$ with edge set $\{e \in E(\mathbb{M}^n) : W_e \le p\}$. Then for
every $0 < p < 1$, the connected components of $\mathbb{M}(n,p)$ and of $\mathbb{G}(n,p)$ have the same vertex
sets.
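The coupling can be checked numerically on small instances. The following Python sketch is only an illustration: it assumes vertices labelled $0, \ldots, n-1$, and the helper names `components`, `mst_edges` and `coupling_holds` are hypothetical. It builds the MST with Kruskal's algorithm, keeps the edges of weight at most $p$ in both the full graph and the MST, and compares the vertex sets of the resulting components.

```python
import itertools
import random


def _make_find(parent):
    """Return a find function (with path halving) for a union-find array."""
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    return find


def components(n, edges):
    """Vertex sets of the connected components of the graph ({0..n-1}, edges)."""
    parent = list(range(n))
    find = _make_find(parent)
    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), set()).add(v)
    return {frozenset(g) for g in groups.values()}


def mst_edges(n, weights):
    """Kruskal's algorithm: scan edges by increasing weight and keep those
    that join two distinct components of the current forest."""
    parent = list(range(n))
    find = _make_find(parent)
    kept = []
    for e in sorted(weights, key=weights.get):
        ri, rj = find(e[0]), find(e[1])
        if ri != rj:
            parent[ri] = rj
            kept.append(e)
    return kept


def coupling_holds(n, p, seed=0):
    """Check that the forest M(n, p) and the graph G(n, p), built from the
    same uniform weights, have components with the same vertex sets."""
    rng = random.Random(seed)
    weights = {e: rng.random() for e in itertools.combinations(range(n), 2)}
    graph_p = [e for e in weights if weights[e] <= p]
    forest_p = [e for e in mst_edges(n, weights) if weights[e] <= p]
    return components(n, graph_p) == components(n, forest_p)
```

For any choice of `n`, `p` and `seed`, `coupling_holds` should return `True`, in line with the statement above.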

In their foundational paper on the subject [26], Erdős and Rényi described the percolation
phase transition for their eponymous graph process. They showed that for $p = c/n$ with $c$
fixed, if $c < 1$ (the subcritical case) then $\mathbb{G}(n,p)$ has largest component of size $O(\log n)$ in
probability, whereas if $c > 1$ (the supercritical case) then the largest component of $\mathbb{G}(n,p)$
has size $(1 + o_p(1))\gamma(c) n$, where $\gamma(c)$ is the survival probability of a Poisson($c$) branching
process. They also showed that for $c > 1$, all components aside from the largest have size
$O(\log n)$ in probability.
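A small simulation can make the phase transition concrete. The sketch below is an illustration only, not part of the argument. It uses the standard fact that the survival probability $\gamma(c)$ of a Poisson($c$) branching process is the largest solution in $[0, 1]$ of $\gamma = 1 - e^{-c\gamma}$, and it samples $\mathbb{G}(n, c/n)$ directly. The function names are hypothetical and vertices are labelled $0, \ldots, n-1$.

```python
import math
import random


def survival_probability(c, iterations=200):
    """Approximate the largest solution g in [0, 1] of g = 1 - exp(-c * g)
    by fixed-point iteration started from g = 1."""
    g = 1.0
    for _ in range(iterations):
        g = 1.0 - math.exp(-c * g)
    return g


def largest_component_size(n, p, seed=0):
    """Size of the largest component in one sample of G(n, p), via union-find."""
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n  # size[r] is the component size when r is a root

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:  # the edge {i, j} is present
                ri, rj = find(i), find(j)
                if ri != rj:
                    if size[ri] < size[rj]:
                        ri, rj = rj, ri
                    parent[rj] = ri
                    size[ri] += size[rj]
    return max(size[find(v)] for v in range(n))
```

For instance, comparing `largest_component_size(n, 2.0 / n) / n` with `survival_probability(2.0)` for moderately large `n` should give values close to one another, while for `c = 0.5` the largest component should occupy only a small fraction of the vertices.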

In view of the above coupling between the graph process and Kruskal's algorithm, the
results of the preceding paragraph strongly suggest that "most of" the global structure of
the MST $\mathbb{M}^n$ should already be present in the largest component of $\mathbb{M}(n, c/n)$, for any
$c > 1$. In order to understand $\mathbb{M}^n$, then, a natural approach is to delve into the structure
of the forest $\mathbb{M}(n,p)$ for $p \sim 1/n$ (the near-critical regime) and, additionally, to study how
the components of this forest attach to one another as $p$ increases through the near-critical
regime. In this paper, we use such a strategy to show that after suitable rescaling of distances
and of mass, the tree $\mathbb{M}^n$, viewed as a measured metric space, converges in distribution to a
random compact measured metric space $\mathscr{M}$ of total mass one, which is a random
real tree in the sense of [27, 44].

$^2$ Later, it will be convenient to allow $p \notin [0, 1]$, and we note that the definition of $\mathbb{G}(n,p)$ still makes sense
in this case.

The space $\mathscr{M}$ is the scaling limit of the minimum spanning tree on the complete graph.
It is binary and its mass measure is concentrated on the leaves of $\mathscr{M}$. The space $\mathscr{M}$ shares
all these features with the first and most famous random real tree, the Brownian continuum
random tree, or CRT [12, 13, 14, 44]. However, $\mathscr{M}$ is not the CRT; we rule out this possibility
by showing that $\mathscr{M}$ almost surely has Minkowski dimension 3. Since the CRT has both
Minkowski dimension 2 and Hausdorff dimension 2, this shows that the law of $\mathscr{M}$ is mutually
singular with that of the CRT, or any rescaled version thereof.

The remainder of the introduction is structured as follows. First, in Section 1.2 below, we
provide the precise statement of our results. Second, in Section 1.3 we provide an overview
of our proof techniques. Finally, in Section 1.4, we situate our results with respect to the 
large body of work by the probability and statistical physics communities on the convergence 
of minimum spanning trees, and briefly address the question of universality. 

1.2 The main results of this paper 

Before stating our results, a brief word on the spaces in which we work is necessary. We
formally introduce these spaces in Section 2, and here only provide a brief summary. First, let
$\mathcal{M}$ be the set of measured isometry-equivalence classes of compact measured metric spaces,
and let $d_{GHP}$ denote the Gromov-Hausdorff-Prokhorov distance on $\mathcal{M}$; the pair $(\mathcal{M}, d_{GHP})$
forms a Polish space.

We wish to think of $\mathbb{M}^n$ as an element of $(\mathcal{M}, d_{GHP})$. In order to do this, we introduce a
measured metric space $M^n$ obtained from $\mathbb{M}^n$ by rescaling distances by $n^{-1/3}$ and assigning
mass $1/n$ to each vertex. The main contribution of this paper is the following theorem.

Theorem 1.1. There exists a random, compact measured metric space $\mathscr{M}$ such that, as
$n \to \infty$,

$$M^n \xrightarrow{d} \mathscr{M}$$

in the space $(\mathcal{M}, d_{GHP})$. The limit $\mathscr{M}$ is a random $\mathbb{R}$-tree. It is almost surely binary, and its
mass measure is concentrated on the leaves of $\mathscr{M}$. Furthermore, almost surely, the Minkowski
dimension of $\mathscr{M}$ exists and is equal to 3.

A consequence of the last statement is that $\mathscr{M}$ is not a rescaled version of the Brownian
CRT $\mathscr{T}$, in the sense that for any non-negative random variable $A$, the laws of $\mathscr{M}$ and the
space $\mathscr{T}$, in which all distances are multiplied by $A$, are mutually singular. Indeed, the
Brownian tree has Minkowski dimension 2 almost surely. The assertions of Theorem 1.1 are
contained within the union of Theorems 4.10 and 5.1 and Corollary 5.3, below.

In a preprint [2] posted simultaneously with the current work, the first author of this 
paper shows that the unscaled tree $\mathbb{M}^n$, when rooted at vertex 1, converges in the local weak
sense to a random infinite tree, and that this limit almost surely has cubic volume growth. 
The results of [2] form a natural complement to Theorem 1.1. 

As mentioned earlier, we approach the study of $M^n$ and of its scaling limit $\mathscr{M}$ via a
detailed description of the graph $\mathbb{G}(n,p)$ and of the forest $\mathbb{M}(n,p)$, for $p$ near $1/n$. As is by
this point well known, the right scaling for the "critical window" is given
by taking $p = 1/n + \lambda/n^{4/3}$, for $\lambda \in \mathbb{R}$, and for such $p$, the largest components of $\mathbb{G}(n,p)$
typically have size of order $n^{2/3}$ and possess a bounded number of cycles [49, 9]. Adopting
this parametrisation, for $\lambda \in \mathbb{R}$ write

$$(\mathbb{G}^{n,i}_\lambda, i \ge 1)$$

for the components of $\mathbb{G}(n, 1/n + \lambda/n^{4/3})$ listed in decreasing order of size (among components
of equal size, list components in increasing order of smallest vertex label, say). For each $i \ge 1$,
we then write $G^{n,i}_\lambda$ for the measured metric space obtained from $\mathbb{G}^{n,i}_\lambda$ by rescaling distances
by $n^{-1/3}$ and giving each vertex mass $n^{-2/3}$, and let

$$G^n_\lambda = (G^{n,i}_\lambda, i \ge 1).$$

We likewise define a sequence $(\mathbb{M}^{n,i}_\lambda, i \ge 1)$ of graphs, and a sequence $M^n_\lambda = (M^{n,i}_\lambda, i \ge 1)$ of
measured metric spaces, starting from $\mathbb{M}(n, 1/n + \lambda/n^{4/3})$ instead of $\mathbb{G}(n, 1/n + \lambda/n^{4/3})$.

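In order to compare sequences $X = (X_i, i \ge 1)$ of elements of $\mathcal{M}$ (i.e., elements of $\mathcal{M}^{\mathbb{N}}$),
we let $\mathbb{L}_p$, for $p \ge 1$, be the set of sequences $X \in \mathcal{M}^{\mathbb{N}}$ with

$$\sum_{i \ge 1} \mathrm{diam}(X_i)^p + \sum_{i \ge 1} \mu_i(X_i)^p < \infty,$$

where $\mu_i$ is the measure carried by $X_i$, and for two such sequences $X = (X_i, i \ge 1)$ and
$X' = (X_i', i \ge 1)$, we let

$$\mathrm{dist}^p_{GHP}(X, X') = \Big(\sum_{i \ge 1} d_{GHP}(X_i, X_i')^p\Big)^{1/p}.$$

The resulting metric space $(\mathbb{L}_p, \mathrm{dist}^p_{GHP})$ is a Polish space.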

The second main result of this paper is the following (see Theorems 4.4 and 4.10 below). 

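Theorem 1.2. Fix $\lambda \in \mathbb{R}$. Then there exists a random sequence $\mathscr{M}_\lambda = (\mathscr{M}^i_\lambda, i \ge 1)$ of
compact measured metric spaces, such that as $n \to \infty$,

$$M^n_\lambda \xrightarrow{d} \mathscr{M}_\lambda \qquad (3)$$

in the space $(\mathbb{L}_4, \mathrm{dist}^4_{GHP})$. Furthermore, let $\hat{\mathscr{M}}^1_\lambda$ be the first term $\mathscr{M}^1_\lambda$ of the limit sequence $\mathscr{M}_\lambda$
with its measure renormalized to be a probability. Then as $\lambda \to \infty$, $\hat{\mathscr{M}}^1_\lambda$ converges in
distribution to $\mathscr{M}$ in the space $(\mathcal{M}, d_{GHP})$.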




1.3 An overview of the proof 

Theorem 1 of [4] states that for each $\lambda \in \mathbb{R}$, there is a random sequence

$$\mathscr{G}_\lambda = (\mathscr{G}^i_\lambda, i \ge 1)$$

of compact measured metric spaces, such that

$$G^n_\lambda \xrightarrow{d} \mathscr{G}_\lambda \qquad (4)$$

in the space $(\mathbb{L}_4, \mathrm{dist}^4_{GHP})$. (Theorem 1 of [4] is, in fact, slightly weaker than this because
the metric spaces there are considered without their accompanying measures, but it is easily
strengthened; see Section 4.) The limiting spaces are similar to $\mathbb{R}$-trees; we call them
$\mathbb{R}$-graphs. In Section 6 we define $\mathbb{R}$-graphs and develop a decomposition of $\mathbb{R}$-graphs analogous
to the classical "core and kernel" decomposition for finite connected graphs (see, e.g., [38]).
We expect this generalisation of the theory of $\mathbb{R}$-trees to find further applications. The
main results of [3] provide precise distributional descriptions of the cores and kernels of the
components of $\mathscr{G}_\lambda$.

It turns out that, having understood the distribution of $G^n_\lambda$, we can access the distribution
of $M^n_\lambda$ by using a minimum spanning tree algorithm called cycle breaking. This algorithm
finds the minimum weight spanning tree of a graph by listing edges in decreasing order of
weight, then considering each edge in turn and removing it if its removal leaves the graph
connected.
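As an illustration of cycle breaking on a finite graph, here is a minimal Python sketch. It assumes a connected input graph on vertices $0, \ldots, n-1$ with distinct weights; the names `is_connected`, `cycle_breaking` and `kruskal_edges` are hypothetical, and the Kruskal routine is included only so that the two outputs can be compared. It makes no attempt at efficiency and is unrelated to the continuum construction developed later.

```python
import itertools
import random


def is_connected(n, edges):
    """Depth-first search from vertex 0 on the graph ({0..n-1}, edges)."""
    adjacency = {v: [] for v in range(n)}
    for i, j in edges:
        adjacency[i].append(j)
        adjacency[j].append(i)
    seen = {0}
    stack = [0]
    while stack:
        v = stack.pop()
        for u in adjacency[v]:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return len(seen) == n


def cycle_breaking(n, weights):
    """List edges in decreasing order of weight and remove each edge whose
    removal leaves the graph connected. Assumes the input graph is connected."""
    edges = set(weights)
    for e in sorted(weights, key=weights.get, reverse=True):
        if is_connected(n, edges - {e}):
            edges.remove(e)
    return edges


def kruskal_edges(n, weights):
    """Kruskal's algorithm, used here only as a reference for comparison."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = set()
    for e in sorted(weights, key=weights.get):
        ri, rj = find(e[0]), find(e[1])
        if ri != rj:
            parent[ri] = rj
            tree.add(e)
    return tree
```

With distinct weights the minimum spanning tree is unique, so `cycle_breaking` and `kruskal_edges` should return the same edge set on any connected input.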

Using the convergence in (4) and an analysis of the cycle breaking algorithm, we will
establish Theorem 1.2. The sequence $\mathscr{M}_\lambda$ is constructed from $\mathscr{G}_\lambda$ by a continuum analogue
of the cycle breaking procedure. Showing that the continuum analogue of cycle breaking
is well-defined and commutes with the appropriate limits is somewhat involved; this is the
subject of Section 7.

For fixed $n$, the process $(M^{n,1}_\lambda, \lambda \in \mathbb{R})$ is eventually constant, and we note that
$M^n = \lim_{\lambda \to \infty} M^{n,1}_\lambda$. In order to establish that $M^n$ converges in distribution in the space $(\mathcal{M}, d_{GHP})$
as $n \to \infty$, we rely on two ingredients. First, the convergence in (3) is strong enough to
imply that the first component $M^{n,1}_\lambda$ converges in distribution as $n \to \infty$ to a limit $\mathscr{M}^1_\lambda$ in
the space $(\mathcal{M}, d_{GHP})$.

Second, the results in [5] entail Lemma 4.5, which in particular implies that for any $\epsilon > 0$,

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\big(d_{GH}(M^{n,1}_\lambda, M^n) \ge \epsilon\big) = 0. \qquad (5)$$

This is enough to prove a version of our main result for the metric spaces without their
measures. In Lemma 4.8, below, we strengthen this statement. Let $\hat{M}^{n,1}_\lambda$ be the measured
metric space obtained from $M^{n,1}_\lambda$ by rescaling so that the total mass is one (in $M^{n,1}_\lambda$ we gave
each vertex mass $n^{-2/3}$; now we give each vertex mass $|V(\mathbb{M}^{n,1}_\lambda)|^{-1}$). We show that for any
$\epsilon > 0$,

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\big(d_{GHP}(\hat{M}^{n,1}_\lambda, M^n) \ge \epsilon\big) = 0. \qquad (6)$$



Since $(\mathcal{M}, d_{GHP})$ is a complete, separable space, the so-called principle of accompanying laws
entails that

$$M^n \xrightarrow{d} \mathscr{M}$$

in the space $(\mathcal{M}, d_{GHP})$ for some limiting random measured metric space $\mathscr{M}$, which is thus
the scaling limit of the minimum spanning tree on the complete graph. Furthermore, still as
a consequence of the principle of accompanying laws, $\mathscr{M}$ is also the limit in distribution of
$\hat{\mathscr{M}}^1_\lambda$ as $\lambda \to \infty$ in the space $(\mathcal{M}, d_{GHP})$.

For fixed $\lambda \in \mathbb{R}$, we will see that each component of $\mathscr{M}_\lambda$ is almost surely binary. Since
$\mathscr{M}$ is compact and (if the measure is ignored) is an increasing limit of $\mathscr{M}^1_\lambda$ as $\lambda \to \infty$, it will
follow that $\mathscr{M}$ is almost surely binary.

To prove that the mass measure is concentrated on the leaves of $\mathscr{M}$, we use a result of
Łuczak [47] on the size of the largest component in the barely supercritical regime. This
result in particular implies that for all $\epsilon > 0$,

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\left(\left|\frac{|V(\mathbb{M}^{n,1}_\lambda)|}{2\lambda n^{2/3}} - 1\right| > \epsilon\right) = 0.$$

Since $\mathbb{M}^n$ has $n$ vertices, it follows that for any $\lambda \in \mathbb{R}$, the proportion of the mass of $M^n$
"already present in $M^{n,1}_\lambda$" is asymptotically negligible. But (5) tells us that for $\lambda$ large, with
high probability every point of $M^n$ not in $M^{n,1}_\lambda$ has distance $o_{\lambda \to \infty}(1)$ from a point of $M^{n,1}_\lambda$,
so has distance $o_{\lambda \to \infty}(1)$ from a leaf of $M^n$. Passing this argument to the limit, it will follow
that $\mathscr{M}$ almost surely places all its mass on its leaves.

The statement on the Minkowski dimension of $\mathscr{M}$ depends crucially on an explicit
description of the components of $\mathscr{G}_\lambda$ from [3], which allows us to estimate the number of balls
needed to cover $\mathscr{M}^1_\lambda$. Along with a refined version of (5), which yields an estimate of the
distance between $\mathscr{M}^1_\lambda$ and $\mathscr{M}$, we are able to obtain bounds on the covering number of $\mathscr{M}$.

This completes our overview, and we now proceed with a brief discussion of related work, 
before turning to details. 

1.4 Related work 

In the majority of the work on convergence of MST's, inter-point distances are chosen so
that the edges of the MST have constant average length (in all the models discussed above,
the average edge length was $o(1)$). For such weights, the limiting object is typically a
non-compact infinite tree or forest. As detailed above, the study bifurcates into the "geometric"
case in which the points lie in a Euclidean space $\mathbb{R}^d$, and the "mean-field" case where the
underlying graph is $K_n$ with i.i.d. edge weights. In both cases, a standard approach is to pass
directly to an infinite underlying graph or point set, and define the minimum spanning tree
(or forest) directly on such a point set.

It is not a priori obvious how to define the minimum spanning tree, or forest, of an infinite
graph, as neither of the algorithms described above is necessarily well-defined (there may be
no smallest weight edge leaving a given vertex or component). However, it is known [10] that
given an infinite locally finite graph $G = (V, E)$ and distinct edge weights $w = \{w_e, e \in E\}$,
the following variant of Prim's algorithm is well-defined and builds a forest, each component
of which is an infinite tree.

INVASION PERCOLATION: for each $v \in V$, run Prim's algorithm starting from $v$ and call
the resulting set of edges $E_v$. Then let $\mathrm{MSF}(G, w)$ be the graph with vertices $V$ and edges
$\bigcup_{v \in V} E_v$.

The graph $\mathrm{MSF}(G, w)$ is also described by the following rule, which is conceptually based
on the coupling between Kruskal's algorithm and the percolation process, described above.
For each $r > 0$, let $G_r$ be the subgraph with edges $\{e \in E : w_e < r\}$. Then an edge
$e = uv \in E$ with $w_e = r$ is an edge of $\mathrm{MSF}(G, w)$ if and only if $u$ and $v$ are in distinct
components of $G_r$ and one of these components is finite.

The latter characterisation again allows the MSF to be studied by coupling with a
percolation process. This connection was exploited by Alexander and Molchanov [17] in their
proof that the MSF almost surely consists of a single, one-ended tree for the square,
triangular, and hexagonal lattices with i.i.d. Uniform[0, 1] edge weights and, later, to prove the
same result for the MSF of the points of a homogeneous Poisson process in $\mathbb{R}^2$ [15]. Newman
[54] has also shown that in lattice models in $\mathbb{R}^d$, the critical percolation probability $\theta(p_c)$ is
equal to 0 if and only if the MSF is a proper forest (contains more than one tree). Lyons,
Peres and Schramm [50] developed the connection with critical percolation. Among several
other results, they showed that if $G$ is any Cayley graph for which $\theta(p_c(G)) = 0$, then the
component trees in the MSF all have one end almost surely, and that almost surely every
component tree of the MSF itself has percolation threshold $p_c = 1$. (See also [71] for
subsequent work on a similar model.) For two-dimensional lattice models, more detailed results
about the behaviour of the so-called "invasion percolation tree", constructed by running
Prim's algorithm once from a fixed vertex, have also recently been obtained [24, 23].

In the mean-field case, one common approach is to study the MST or MSF from the 
perspective of local weak convergence [11]. This leads one to investigate the minimum 
spanning forest of Aldous' Poisson-weighted infinite tree (PWIT). Such an approach is used 
implicitly in [51] in studying the first $O(\sqrt{n})$ steps of Prim's algorithm on $K_n$, and explicitly
in [6] to relate the behaviour of Prim's algorithm on $K_n$ and on the PWIT. Aldous [8]
establishes a local weak limit for the tree obtained from the MST of $K_n$ as follows. Delete
the (typically unique) edge whose removal minimizes the size of the component containing 
vertex 1 in the resulting graph, then keep only the component containing 1. 

Almost nothing is known about compact scaling limits for whole MST's. In two dimen- 
sions, Aizenman, Burchard, Newman and Wilson [7] have shown tightness for the family of 
random sets given by considering the subtree of the MST connecting a finite set of points 
(the family is obtained by varying the set of points), either in the square, triangular or 
hexagonal lattice, or in a Poisson process. They also studied the properties of subsequential 
limits for such families, showing, among other results, that any limiting "tree" has Hausdorff 
dimension strictly between 1 and 2, and that the curves connecting points in such a tree 
are almost surely Hölder continuous of order $\alpha$ for any $\alpha < 1/2$. Recently, Garban, Pete,
and Schramm [33] announced that they have proved the existence of a scaling limit for the 
MST in 2D lattice models. The MST is expected to be invariant under scalings, rotations 
and translations, but not conformally invariant, and to have no points of degree greater than 
four. In the mean-field case, however, we are not aware of any previous work on scaling 
limits for the MST. In short, the scaling limit that we identify in this paper appears to 
be a novel mathematical object. It is one of the first MST scaling limits to be identified, 
and is perhaps the first scaling limit to be identified for any problem from combinatorial 
optimisation. 

We expect $\mathscr{M}$ to be a universal object: the MST's of a wide range of "high-dimensional"
graphs should also have $\mathscr{M}$ as a scaling limit. By way of analogy, we mention two facts.
First, Peres and Revelle [63] have shown the following universality result for uniform spanning
trees (here informally stated). Let $\{G_n\}$ be a sequence of vertex transitive graphs of size
tending to infinity. Suppose that (a) the uniform mixing time of simple random walk on
$G_n$ is $o(|G_n|^{1/2})$, and (b) $G_n$ is sufficiently "high-dimensional", in that the expected number
of meetings between two random walks with the same starting point, in the first $|G_n|^{1/2}$
steps, is uniformly bounded. Then after a suitable rescaling of distances, the uniform spanning tree
of $G_n$ converges to the CRT in the sense of finite-dimensional distributions. Second, under a
related set of conditions, van der Hofstad and Nachmias [35] have very recently proved that
the largest component of critical percolation on $G_n$ in the barely supercritical phase has the
same scaling as in the Erdős-Rényi graph process (we omit a precise statement of their result
as it is rather technical, but mention that their conditions are general enough to address the 
notable case of percolation on the hypercube). However, a proof of an analogous result for 
the MST seems, at this time, quite distant. As will be seen below, our proof requires detailed 
control on the metric and mass structure of all components of the Kruskal process in the 
critical window and, for the moment, this is not available for any other models. 
2 Metric spaces and types of convergence



The reader may wish to simply skim this section on a first reading, referring back to it as 
needed. 

2.1 Notions of convergence

Gromov-Hausdorff distance

Given a metric space $(X, d)$, we write $[X, d]$ for the isometry class of $(X, d)$, and frequently
use the notation $X$ for either $(X, d)$ or $[X, d]$ when there is no risk of ambiguity. For a metric
space $(X, d)$ we write $\mathrm{diam}((X, d)) = \sup_{x, y \in X} d(x, y)$, which may be infinite.

Let $X = (X, d)$ and $X' = (X', d')$ be metric spaces. If $C$ is a subset of $X \times X'$, the
distortion $\mathrm{dis}(C)$ is defined by

$$\mathrm{dis}(C) = \sup\{|d(x, y) - d'(x', y')| : (x, x') \in C, (y, y') \in C\}.$$

A correspondence $C$ between $X$ and $X'$ is a measurable subset of $X \times X'$ such that for every
$x \in X$, there exists $x' \in X'$ with $(x, x') \in C$ and vice versa. Write $C(X, X')$ for the set of
correspondences between $X$ and $X'$. The Gromov-Hausdorff distance $d_{GH}(X, X')$ between
the isometry classes of $(X, d)$ and $(X', d')$ is

$$d_{GH}(X, X') = \frac{1}{2} \inf\{\mathrm{dis}(C) : C \in C(X, X')\},$$

and there is a correspondence which achieves this infimum. (In fact, since our metric spaces
are assumed separable, the requirement that the correspondence be measurable is not strictly
necessary.) It can be verified that $d_{GH}$ is indeed a distance and, writing $\mathring{\mathcal{M}}$ for the set of
isometry classes of compact metric spaces, that $(\mathring{\mathcal{M}}, d_{GH})$ is itself a complete separable metric
space.
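For very small finite metric spaces, the definition can be evaluated by brute force, which may help in getting a feel for correspondences and distortion. The Python sketch below is purely illustrative and exponentially slow: it represents each finite metric space by its distance matrix (an assumption made for this example) and enumerates every correspondence. The function names are hypothetical.

```python
import itertools


def distortion(d, d2, C):
    """dis(C) = max |d(x, y) - d2(x', y')| over pairs (x, x'), (y, y') in C."""
    return max(abs(d[x][y] - d2[x2][y2]) for (x, x2) in C for (y, y2) in C)


def gromov_hausdorff(d, d2):
    """Gromov-Hausdorff distance between two finite metric spaces given by
    their distance matrices, by enumerating all correspondences.

    Only feasible for tiny spaces: there are up to 2^(n*m) candidate subsets.
    """
    n, m = len(d), len(d2)
    pairs = list(itertools.product(range(n), range(m)))
    best = float("inf")
    # A correspondence needs at least max(n, m) pairs to cover both spaces.
    for r in range(max(n, m), len(pairs) + 1):
        for C in itertools.combinations(pairs, r):
            covers_first = {x for x, _ in C} == set(range(n))
            covers_second = {y for _, y in C} == set(range(m))
            if covers_first and covers_second:
                best = min(best, distortion(d, d2, C))
    return best / 2
```

For example, comparing a one-point space `[[0]]` with a two-point space `[[0, 1], [1, 0]]` gives half the diameter of the latter, since the only correspondence pairs the single point with both points.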

Let $(X, d, (x_1, \ldots, x_k))$ and $(X', d', (x_1', \ldots, x_k'))$ be metric spaces, each with an ordered
set of $k$ distinguished points (we call such spaces $k$-pointed metric spaces).$^3$ We say that these
two $k$-pointed metric spaces are isometry-equivalent if there exists an isometry $\phi : X \to X'$
such that $\phi(x_i) = x_i'$ for every $i \in \{1, \ldots, k\}$. As before, we write $[X, d, (x_1, \ldots, x_k)]$ for the
isometry equivalence class of $(X, d, (x_1, \ldots, x_k))$, and denote either by $X$ when there is little
chance of ambiguity.

The $k$-pointed Gromov-Hausdorff distance is defined as

$$d^k_{GH}(X, X') = \frac{1}{2} \inf\{\mathrm{dis}(C) : C \in C(X, X') \text{ such that } (x_i, x_i') \in C, 1 \le i \le k\}.$$

Much as above, the space $(\mathring{\mathcal{M}}^{(k)}, d^k_{GH})$ of isometry classes of $k$-pointed compact metric spaces
is itself a complete separable metric space.

The Gromov-Hausdorff-Prokhorov distance

A compact measured metric space is a triple $(X, d, \mu)$ where $(X, d)$ is a compact metric space
and $\mu$ is a (non-negative) finite measure on $(X, \mathcal{B})$, where $\mathcal{B}$ is the Borel $\sigma$-algebra on $(X, d)$.
Given a measured metric space $(X, d, \mu)$, a metric space $(X', d')$ and a measurable function
$\phi : X \to X'$, we write $\phi_*\mu$ for the push-forward of the measure $\mu$ to the space $(X', d')$. Two
compact measured metric spaces $(X, d, \mu)$ and $(X', d', \mu')$ are called isometry-equivalent if
there exists an isometry $\phi : (X, d) \to (X', d')$ such that $\phi_*\mu = \mu'$. The isometry-equivalence
class of $(X, d, \mu)$ will be denoted by $[X, d, \mu]$. Again, both will often be denoted by $X$ when
there is little risk of ambiguity. If $X = (X, d, \mu)$ then we write $\mathrm{mass}(X) = \mu(X)$.

$^3$ When $k = 1$, we simply refer to pointed (rather than 1-pointed) metric spaces, and write $(X, d, x)$ rather
than $(X, d, (x))$.

There are several natural distances on compact measured metric spaces that generalize 
the Gromov-Hausdorff distance, see for instance [28, 74, 52, 1]. The presentation we adopt 
is still different from these references, but closest in spirit to [1] since we are dealing with 
arbitrary finite measures rather than just probability measures. In particular, it induces the 
same topology as the compact Gromov-Hausdorff-Prokhorov metric of [1]. 

If (X, d) and (X', d') are two metric spaces, let M(X, X') be the set of finite non-negative 
Borel measures on X x X'. We will denote by p,p' the canonical projections from X x X' 
to X and X'. 

Let $\mu$ and $\mu'$ be finite non-negative Borel measures on $X$ and $X'$ respectively. The
discrepancy of $\pi \in M(X, X')$ with respect to $\mu$ and $\mu'$ is the quantity

$$D(\pi; \mu, \mu') = \|\mu - p_*\pi\| + \|\mu' - p'_*\pi\|,$$

where $\|\nu\|$ is the total variation of the signed measure $\nu$. Note in particular that $D(\pi; \mu, \mu') \ge
|\mu(X) - \mu'(X')|$, by the triangle inequality and the fact that $\|\nu\| \ge |\nu(1)|$, where $\nu(1)$ is the
total mass of $\nu$. If $\mu$ and $\mu'$ are probability distributions (or have the same mass), a measure
$\pi \in M(X, X')$ with $D(\pi; \mu, \mu') = 0$ is a coupling of $\mu$ and $\mu'$ in the standard sense.

Recall that the Prokhorov distance between two finite non-negative Borel measures $\mu$
and $\mu'$ on the same metric space $(X, d)$ is given by

$$\inf\{\epsilon > 0 : \mu(F) \le \mu'(F^\epsilon) + \epsilon \text{ and } \mu'(F) \le \mu(F^\epsilon) + \epsilon \text{ for every closed } F \subseteq X\},$$

where $F^\epsilon$ denotes the $\epsilon$-neighbourhood of $F$. An alternative distance, which generates the
same topology but more easily extends to the setting where $\mu$ and $\mu'$ are measures on different
metric spaces, is given by

$$\inf\{\epsilon > 0 : D(\pi; \mu, \mu') < \epsilon,\ \pi(\{(x, x') \in X \times X : d(x, x') \ge \epsilon\}) < \epsilon \text{ for some } \pi \in M(X, X)\}.$$

To extend this, we replace the condition on $\{(x, x') \in X \times X : d(x, x') \ge \epsilon\}$ by an analogous
condition on the measure of the set of pairs lying outside the correspondence. More precisely,
let $X = (X, d, \mu)$ and $X' = (X', d', \mu')$ be measured metric spaces. The
Gromov-Hausdorff-Prokhorov distance between $X$ and $X'$ is defined as

$$d_{GHP}(X, X') = \inf\Big\{\frac{1}{2}\mathrm{dis}(C) \vee D(\pi; \mu, \mu') \vee \pi(C^c)\Big\},$$

the infimum being taken over all $C \in C(X, X')$ and $\pi \in M(X, X')$. Here and elsewhere we
write $x \vee y = \max(x, y)$ (and, likewise, $x \wedge y = \min(x, y)$).

Just as for $d_{GH}$, it can be verified that $d_{GHP}$ is a distance and that writing $\mathcal{M}$ for the set
of measured isometry-equivalence classes of compact measured metric spaces, $(\mathcal{M}, d_{GHP})$ is
a complete separable metric space (see, e.g., [1]).

Note that $d_{GHP}((X, d, 0), (X', d', 0)) = d_{GH}((X, d), (X', d'))$. In other words, the mapping
$[X, d] \mapsto [X, d, 0]$ is an isometric embedding of $(\mathring{\mathcal{M}}, d_{GH})$ into $(\mathcal{M}, d_{GHP})$, and we will sometimes
abuse notation by writing $[X, d] \in \mathcal{M}$. Note also that

$$d_{GH}(X, X') \vee |\mu(X) - \mu'(X')| \le d_{GHP}(X, X') \le \frac{1}{2}(\mathrm{diam}(X) + \mathrm{diam}(X')) \vee (\mu(X) + \mu'(X')).$$

In particular, if $Z$ is the "zero" metric space consisting of a single point with measure 0, then

$$d_{GHP}(X, Z) = \frac{\mathrm{diam}(X)}{2} \vee \mu(X), \quad \text{for every } X = [X, d, \mu]. \qquad (7)$$

Finally, we can define an analogue of $d_{GHP}$ for measured isometry-equivalence classes of
spaces of the form $(X, d, \mathbf{x}, \boldsymbol{\mu})$ where $\mathbf{x} = (x_1, \ldots, x_k)$ are points of $X$ and $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_l)$ are
finite Borel measures on $X$. If $(X, d, \mathbf{x}, \boldsymbol{\mu})$, $(X', d', \mathbf{x}', \boldsymbol{\mu}')$ are such spaces, whose measured,
pointed isometry classes are denoted by $X$, $X'$, we let

$$d^{k,l}_{GHP}(X, X') = \inf\Big\{\frac{1}{2}\mathrm{dis}(C) \vee \max_{1 \le j \le l}\big(D(\pi_j; \mu_j, \mu_j') \vee \pi_j(C^c)\big)\Big\},$$

where the infimum is over all $C \in C(X, X')$ such that $(x_i, x_i') \in C$, $1 \le i \le k$, and all
$\pi_j \in M(X, X')$, $1 \le j \le l$. Writing $\mathcal{M}^{k,l}$ for the set of measured isometry-equivalence classes
of compact metric spaces equipped with $k$ marked points and $l$ finite Borel measures, we
again obtain a complete separable metric space $(\mathcal{M}^{k,l}, d^{k,l}_{GHP})$. We will need the following
fact, which is in essence [52, Proposition 10], except that we have to take into account more
measures and/or marks. This is a minor modification of the setting of [52], and the proof is
similar.

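Proposition 2.1. Let $\mathrm{X}_n = (X_n, d_n, \mathbf{x}_n, \boldsymbol{\mu}_n)$ converge to
$\mathrm{X}_\infty = (X_\infty, d_\infty, \mathbf{x}_\infty, \boldsymbol{\mu}_\infty)$ in $\mathcal{M}^{k,l}$,
and assume that the first measure $\mu^1_n$ of $\boldsymbol{\mu}_n$ is a probability measure for every
$n \in \mathbb{N} \cup \{\infty\}$. Let $y_n$ be a random variable with distribution $\mu^1_n$, and let
$\tilde{\mathbf{x}}_n = (x^1_n, \ldots, x^k_n, y_n)$. Then $(X_n, d_n, \tilde{\mathbf{x}}_n, \boldsymbol{\mu}_n)$ converges in
distribution to $(X_\infty, d_\infty, \tilde{\mathbf{x}}_\infty, \boldsymbol{\mu}_\infty)$ in $\mathcal{M}^{k+1,l}$.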



Sequences of metric spaces 

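We now consider a natural metric on certain sequences of measured metric spaces. For $p \ge 1$
and $\mathbf{X} = (\mathrm{X}_i, i \ge 1)$, $\mathbf{X}' = (\mathrm{X}'_i, i \ge 1)$ in $\mathcal{M}^{\mathbb{N}}$, we let

\[ \mathrm{dist}^{p}_{GHP}(\mathbf{X}, \mathbf{X}') = \Bigl( \sum_{i \ge 1} d_{GHP}(\mathrm{X}_i, \mathrm{X}'_i)^{p} \Bigr)^{1/p}. \]

If $\mathbf{X} \in \mathcal{M}^{n}$ for some $n \in \mathbb{N}$, we consider $\mathbf{X}$ as an element of $\mathcal{M}^{\mathbb{N}}$ by
appending to $\mathbf{X}$ an infinite sequence of copies of the "zero" metric space $\mathrm{Z}$. This allows us
to use $\mathrm{dist}^{p}_{GHP}$ to compare sequences of metric spaces with different numbers of elements, and
to compare finite sequences with infinite sequences. In particular, let $\mathbf{Z} = (\mathrm{Z}, \mathrm{Z}, \ldots)$, and

\[ L_p = \{\mathbf{X} \in \mathcal{M}^{\mathbb{N}} : \mathrm{dist}^{p}_{GHP}(\mathbf{X}, \mathbf{Z}) < \infty\}, \]

so, by (7), $\mathbf{X} \in L_p$ if and only if the sequences $(\mathrm{diam}(X_i), i \ge 1)$ and $(\mu_i(X_i), i \ge 1)$
are in $\ell^{p}(\mathbb{N})$. We leave the reader to check that $(L_p, \mathrm{dist}^{p}_{GHP})$ is a complete separable
metric space.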
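As a small illustration of how (7) turns membership in $L_p$ into a statement about diameters and masses, the sketch below computes $\mathrm{dist}^{p}_{GHP}(\mathbf{X}, \mathbf{Z})$ for a finite sequence, using only the formula $d_{GHP}(\mathrm{X},\mathrm{Z}) = \mathrm{diam}(X)/2 \vee \mu(X)$. The function name and the list-based inputs are assumptions made for the example.

```python
def dist_ghp_to_zero(diams, masses, p):
    """dist^p_GHP between a finite sequence of spaces and the zero sequence.

    diams[i] and masses[i] are the diameter and total mass of the i-th space;
    by (7), its GHP distance to the zero space is max(diam / 2, mass).
    """
    return sum(max(dm / 2.0, m) ** p for dm, m in zip(diams, masses)) ** (1.0 / p)
```

For instance, `dist_ghp_to_zero([0, 0], [3, 4], 2)` combines the two masses as a Euclidean norm, since both diameters vanish.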




2.2 Some general metric notions 

Let $(X, d)$ be a metric space. For $x \in X$ and $r > 0$, we let $B_r(x) = \{y \in X : d(x, y) < r\}$
and $\bar{B}_r(x) = \{y \in X : d(x,y) \le r\}$. We say $(X, d)$ is degenerate if $|X| = 1$. As regards
metric spaces, we mostly follow [22] for our terminology.
Paths, length, cycles

Let $C([a, b],X)$ be the set of continuous functions from $[a, b]$ to $X$, hereafter called paths with
domain $[a, b]$ or paths from $a$ to $b$. The image of a path is called an arc; it is a simple arc if
the path is injective. If $f \in C([a, b],X)$, the length of $f$ is defined by

\[ \mathrm{len}(f) = \sup\Bigl\{ \sum_{i=1}^{k} d(f(t_{i-1}), f(t_i)) : k \ge 1,\ t_0, t_1, \ldots, t_k \in [a,b],\ t_0 \le t_1 \le \ldots \le t_k \Bigr\}. \]

If $\mathrm{len}(f) < \infty$, then the function $\varphi : [a, b] \to [0,\mathrm{len}(f)]$ defined by
$\varphi(t) = \mathrm{len}(f|_{[a,t]})$ is non-decreasing and surjective. The function $f \circ \varphi^{-1}$, where
$\varphi^{-1}$ is the right-continuous inverse of $\varphi$, is easily seen to be continuous, and we call it
the path $f$ parameterized by arc-length. The intrinsic distance (or intrinsic metric) associated
with $(X, d)$ is the function $d_l$ defined by

\[ d_l(x, y) = \inf\{\mathrm{len}(f) : f \in C([0, 1], X),\ f(0) = x,\ f(1) = y\}. \]

The function $d_l$ need not take finite values. When it does, it defines a new distance on
$X$ such that $d \le d_l$. The metric space $(X, d)$ is called intrinsic if $d = d_l$. Similarly, if
$Y \subseteq X$ then the intrinsic metric on $Y$ is given by

\[ d_Y(x, y) = \inf\{\mathrm{len}(f) : f \in C([0, 1], Y),\ f(0) = x,\ f(1) = y\}. \]

Given $x, y \in X$, a geodesic between $x$ and $y$ (also called a shortest path between $x$ and $y$)
is an isometric embedding $f : [a, b] \to X$ such that $f(a) = x$ and $f(b) = y$ (so that obviously
$\mathrm{len}(f) = b - a = d(x,y)$). In this case, we call the image $\mathrm{Im}(f)$ a geodesic arc between $x$
and $y$.

A metric space $(X, d)$ is called a geodesic space if for any two points $x, y$ there exists
a geodesic between $x$ and $y$. A geodesic space is obviously an intrinsic space. If $(X, d)$ is
compact, then the two notions are in fact equivalent. Also note that for every $x$ in a geodesic
space and $r > 0$, $\bar{B}_r(x)$ is the closure of $B_r(x)$. Essentially all metric spaces $(X,d)$ that we
consider in this paper are in fact compact geodesic spaces.

A path $f \in C([a, b],X)$ is a local geodesic between $x$ and $y$ if $f(a) = x$, $f(b) = y$, and for
any $t \in [a, b]$ there is a neighborhood $V$ of $t$ in $[a, b]$ such that $f|_V$ is a geodesic. It is then
straightforward that $b - a = \mathrm{len}(f)$. (Our terminology differs from that of [22], where this
would be called a geodesic. We also note that we do not require $x$ and $y$ to be distinct.)

An embedded cycle is the image of a continuous injective function $f : \mathbb{S}_1 \to X$, where
$\mathbb{S}_1 = \{z \in \mathbb{C} : |z| = 1\}$. The length $\mathrm{len}(f)$ is the length of the path $g : [0, 1] \to X$
defined by $g(t) = f(e^{2\pi i t})$ for $0 \le t \le 1$. It is easy to see that this length depends only on
the embedded cycle $c = \mathrm{Im}(f)$ rather than its particular parametrisation. We call it the length
of the embedded cycle, and write $\mathrm{len}(c)$ for this length. A metric space with no embedded cycle
is called acyclic, and a metric space with exactly one embedded cycle is called unicyclic.

$\mathbb{R}$-trees and $\mathbb{R}$-graphs

A metric space $\mathrm{X} = (X, d)$ is an $\mathbb{R}$-tree if it is an acyclic geodesic metric space. If $(X, d)$
is an $\mathbb{R}$-tree then for $x \in X$, the degree $\deg_X(x)$ of $x$ is the number of connected components
of $X \setminus \{x\}$. A leaf is a point of degree 1; we let $\mathcal{L}(X)$ be the set of leaves of $X$.

A metric space $(X, d)$ is an $\mathbb{R}$-graph if it is locally an $\mathbb{R}$-tree in the following sense. Note
that by definition an $\mathbb{R}$-graph is connected, being a geodesic space.
Definition 2.2. A compact geodesic metric space $(X,d)$ is an $\mathbb{R}$-graph if for every $x \in X$,
there exists $\varepsilon > 0$ such that $(B_\varepsilon(x), d|_{B_\varepsilon(x)})$ is an $\mathbb{R}$-tree.

Let $\mathrm{X} = (X, d)$ be an $\mathbb{R}$-graph and fix $x \in X$. The degree of $x$, denoted by $\deg_X(x)$ and
with values in $\mathbb{N} \cup \{\infty\}$, is defined to be the degree of $x$ in $B_\varepsilon(x)$ for every $\varepsilon$ small
enough so that $(B_\varepsilon(x), d)$ is an $\mathbb{R}$-tree, and this definition does not depend on a particular
choice of $\varepsilon$. If $Y \subseteq X$ and $x \in Y$, we can likewise define the degree $\deg_Y(x)$ of $x$ in $Y$ as the
degree of $x$ in the $\mathbb{R}$-tree $(B_\varepsilon(x) \cap Y(x)) \setminus \{x\}$, where $Y(x)$ is the connected component of
$Y$ that contains $x$, for any $\varepsilon$ small enough. Obviously, $\deg_Y(x) \le \deg_{Y'}(x)$ whenever $Y \subseteq Y'$.

Let

\[ \mathcal{L}(X) = \{x \in X : \deg_X(x) = 1\}, \qquad \mathrm{skel}(X) = \{x \in X : \deg_X(x) \ge 2\}. \]

An element of $\mathcal{L}(X)$ is called a leaf of $X$, and the set $\mathrm{skel}(X)$ is called the skeleton of $X$.
A point with degree at least 3 is called a branchpoint of $X$. We let $k(X)$ be the set of
branchpoints of $X$. If $X$ is, in fact, an $\mathbb{R}$-tree, then $\mathrm{skel}(X)$ is the set of points whose removal
disconnects the space, but this is not true in general. Alternatively, it is easy to see that

\[ \mathrm{skel}(X) = \bigcup_{x,y \in X} \ \bigcup_{c \in \Gamma(x,y)} c \setminus \{x,y\}, \]

where for $x,y \in X$, $\Gamma(x,y)$ denotes the collection of all geodesic arcs between $x$ and $y$.
Since $(X, d)$ is separable, this may be re-written as a countable union, and so there is a
unique $\sigma$-finite Borel measure $\ell$ on $X$ with $\ell(\mathrm{Im}(g)) = \mathrm{len}(g)$ for every injective path $g$,
and such that $\ell(X \setminus \mathrm{skel}(X)) = 0$. The measure $\ell$ is the Hausdorff measure of dimension 1
on $X$, and we refer to it as the length measure on $X$. If $(X, d)$ is an $\mathbb{R}$-graph then the set
$\{x \in X : \deg_X(x) \ge 3\}$ is countable (as is classically the case for compact $\mathbb{R}$-trees), and
hence this set has measure zero under $\ell$.

Definition 2.3. Let $(X,d)$ be an $\mathbb{R}$-graph. Its core, denoted by $\mathrm{core}(X)$, is the union of
all the simple arcs having both endpoints in embedded cycles of $X$. If it is non-empty, then
$(\mathrm{core}(X), d)$ is an $\mathbb{R}$-graph with no leaves.

The last part of this definition is in fact a proposition, which is stated more precisely
and proved below as Proposition 6.2. Since the core of $X$ encapsulates all the embedded
cycles of $X$, it is intuitively clear that when we remove $\mathrm{core}(X)$ from $X$, we are left with a
family of $\mathbb{R}$-trees. This can be formalized as follows. Fix $x \in X \setminus \mathrm{core}(X)$, and let $f$ be a
shortest path from $x$ to $\mathrm{core}(X)$, i.e., a geodesic from $x$ to $y \in \mathrm{core}(X)$, where $y \in \mathrm{core}(X)$
is chosen so that $\mathrm{len}(f)$ is minimum (recall that $\mathrm{core}(X)$ is a closed subspace of $X$). This
shortest path is unique, otherwise we would easily be able to construct an embedded cycle $c$
not contained in $\mathrm{core}(X)$, contradicting the definition of $\mathrm{core}(X)$. Let $\alpha(x)$ be the endpoint
of this path not equal to $x$, which is thus the unique point of $\mathrm{core}(X)$ that is closest to $x$.
By convention, we let $\alpha(x) = x$ if $x \in \mathrm{core}(X)$. We call $\alpha(x)$ the point of attachment of $x$.

Proposition 2.4. The relation $x \sim y \iff \alpha(x) = \alpha(y)$ is an equivalence relation on $X$.
If $[x]$ is the equivalence class of $x$, then $([x],d)$ is a compact $\mathbb{R}$-tree. The equivalence class
$[x]$ of a point $x \in \mathrm{core}(X)$ is a singleton if and only if $\deg_X(x) = \deg_{\mathrm{core}(X)}(x)$.

Proof. The fact that $\sim$ is an equivalence relation is obvious. Fix any equivalence class $[x]$.
Note that $[x] \cap \mathrm{core}(X)$ contains only the point $\alpha(x)$, so that $[x]$ is connected and acyclic by
definition. Hence, any two points of $[x]$ are joined by a unique simple arc (in $[x]$). This path
is moreover a shortest path for the metric $d$, because a path starting and ending in $[x]$, and
visiting $X \setminus [x]$, must pass at least twice through $\alpha(x)$ (if this were not the case, we could
find an embedded cycle not contained in $\mathrm{core}(X)$). The last statement is easy and left to the
reader. □

Corollary 2.5. If $(X, d)$ is an $\mathbb{R}$-graph, then $\mathrm{core}(X)$ is the maximal closed subset of $X$
having only points of degree greater than or equal to 2.

Proof. If $Y$ is closed and strictly contains $\mathrm{core}(X)$, then we can find $x \in Y$ such that
$d(x, \mathrm{core}(X)) = d(x,\alpha(x)) > 0$ is maximal. Then $Y \cap [x]$ is included in the set of points
$y \in [x]$ such that the geodesic arc from $y$ to $\alpha(x)$ does not pass through $x$. This set is an
$\mathbb{R}$-tree in which $x$ is a leaf, so $\deg_Y(x) \le 1$. □

Note that this characterisation is very close to the definition of the core of a (discrete) 
graph. Another important structural component is $\mathrm{conn}(X)$, the set of points $x$ of $\mathrm{core}(X)$
such that $X \setminus \{x\}$ is connected. Figure 2 summarizes the preceding definitions. The space
conn(X) is not connected or closed in general. Clearly, a point of conn(X) must be contained 
in an embedded cycle of X, but the converse is not necessarily true. A partial converse is as 
follows. 

Proposition 2.6. Let $x \in \mathrm{core}(X)$ have degree $\deg_X(x) = 2$ and suppose $x$ is contained in
an embedded cycle of $X$. Then $x \in \mathrm{conn}(X)$.

Proof. Let $c$ be an embedded cycle containing $x$. Fix $y, y' \in X \setminus \{x\}$, and let $\varphi, \varphi'$ be
geodesics from $y, y'$ to their respective closest points $z, z' \in c$. Note that $z$ is distinct from $x$
because otherwise, $x$ would have degree at least 3. Likewise, $z' \ne x$.

Let $\varphi''$ be a parametrisation of the arc of $c$ between $z$ and $z'$ that does not contain $x$;
then the concatenation of $\varphi$, $\varphi''$ and the time-reversal of the path $\varphi'$ is a path from $y$
to $y'$, not passing through $x$. Hence, $X \setminus \{x\}$ is connected. □

Let us now discuss the structure of $\mathrm{core}(X)$. Equivalently, we need to describe $\mathbb{R}$-graphs
with no leaves, because such graphs are equal to their cores by Corollary 2.5.

A graph with edge-lengths is a triple $(V, E, (\ell(e), e \in E))$ where $(V, E)$ is a finite connected
multigraph, and $\ell(e) \in (0, \infty)$ for every $e \in E$. With every such object, one can associate an
$\mathbb{R}$-graph without leaves, which is the metric graph obtained by viewing the edges of $(V, E)$ as
segments with respective lengths $\ell(e)$. Formally, this $\mathbb{R}$-graph is the metric gluing of disjoint
copies $Y_e$ of the real segments $[0, \ell(e)]$, $e \in E$, according to the graph structure of $(V, E)$. We
refer the reader to [22] for details on metric gluings and metric graphs. In Section 6, we will
prove the following result.

Theorem 2.7. An $\mathbb{R}$-graph with no leaves is either a cycle, or is the metric gluing of a
finite connected multigraph with edge-lengths in which all vertices have degree at least 3. The
associated multigraph, without the edge-lengths, is called the kernel of $X$, and denoted by
$\ker(X) = (k(X), e(X))$.

The precise definition of ker(X), and the proof of Theorem 2.7, both appear in Section 6.3. 
For a connected multigraph $G = (V,E)$, the surplus $s(G)$ is $|E| - |V| + 1$. For an $\mathbb{R}$-graph
$(X,d)$, we let $s(X) = s(\ker(X))$ if $\ker(X)$ is non-empty. Otherwise, either $(X,d)$ is an
$\mathbb{R}$-tree or $\mathrm{core}(X)$ is a cycle. In the former case we set $s(X) = 0$; in the latter we set $s(X) = 1$.
Since the degree of every vertex in $\ker(X)$ is at least 3, we have
$2|e(X)| = \sum_{v \in k(X)} \deg(v) \ge 3|k(X)|$, and so if $s(X) > 1$ we have

\[ |k(X)| \le 2s(X) - 2, \qquad (8) \]

with equality precisely if $\ker(X)$ is 3-regular.

Figure 2: An example of an $\mathbb{R}$-graph $(X, d)$, emphasizing the structural components. $\mathrm{core}(X)$
is in thick line (black and red), with $\mathrm{conn}(X)$ in red. The subtrees hanging from $\mathrm{core}(X)$ are
in thin blue line. Kernel vertices are represented as large dots. An example of the projection
$\alpha : X \to \mathrm{core}(X)$ is provided.
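For readers who want the intermediate step behind (8) spelled out, the following short derivation combines the degree bound with the definition of the surplus. It assumes only that $\ker(X)$ is non-empty, so that $s(X) = |e(X)| - |k(X)| + 1$; the bound on $|e(X)|$ in the last line is a direct consequence added here for illustration.

```latex
% Assumption: ker(X) is non-empty, so s(X) = |e(X)| - |k(X)| + 1.
\begin{align*}
|e(X)| &= s(X) + |k(X)| - 1, \\
2\bigl(s(X) + |k(X)| - 1\bigr) &= 2\,|e(X)| \;\ge\; 3\,|k(X)|, \\
|k(X)| &\le 2\,s(X) - 2, \\
|e(X)| &= s(X) + |k(X)| - 1 \;\le\; 3\,s(X) - 3 .
\end{align*}
```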

3 Cycle-breaking in discrete and continuous graphs 
3.1 The cycle-breaking algorithm 

Let $G = (V, E)$ be a finite connected multigraph. Let $\mathrm{conn}(G)$ be the set of all edges
$e \in E$ such that $G \setminus e = (V, E \setminus \{e\})$ is connected.

If $s(G) > 0$, then $G$ contains at least one cycle and $\mathrm{conn}(G)$ is non-empty. In this case,
let $e$ be a uniform random edge in $\mathrm{conn}(G)$, and let $K(G, \cdot)$ be the law of the multigraph
$G \setminus e$. If $s(G) = 0$, then $K(G, \cdot)$ is the Dirac mass at $G$. By definition, $K$ is a Markov kernel
from the set of graphs with surplus $s$ to the set of graphs with surplus $(s - 1) \vee 0$. Writing
$K^n$ for the $n$-fold application of the kernel $K$, we have that $K^n(G, \cdot)$ does not depend on $n$
for $n \ge s(G)$. We define the kernel $K^\infty(G, \cdot)$ to be equal to this common value: a graph
has law $K^\infty(G, \cdot)$ if it is obtained from $G$ by repeatedly removing uniform non-disconnecting
edges.
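The procedure just described is easy to simulate. The sketch below is one possible implementation, under the assumption that a multigraph is stored as a list of vertices and a list of `(u, v)` pairs (repeated pairs and loops allowed); the function names are illustrative, and the connectivity test is a plain graph search rather than anything optimised.

```python
import random

def is_connected(vertices, edges):
    """Graph search from an arbitrary vertex; True if every vertex is reached."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)

def conn(vertices, edges):
    """Indices of the edges whose removal leaves the multigraph connected."""
    return [i for i in range(len(edges))
            if is_connected(vertices, edges[:i] + edges[i + 1:])]

def cycle_break(vertices, edges, rng=random):
    """Remove uniform non-disconnecting edges until the surplus is zero."""
    edges = list(edges)
    while len(edges) - len(vertices) + 1 > 0:   # surplus s(G) > 0
        i = rng.choice(conn(vertices, edges))    # uniform edge of conn(G)
        edges.pop(i)
    return edges
```

On a triangle with one pendant edge, for example, `cycle_break` removes exactly one of the three triangle edges and always keeps the pendant edge, since the latter is never in `conn`.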

Proposition 3.1. The probability distribution $K^\infty(G, \cdot)$ is the law of the minimum spanning
tree of $G$, when the edges $E$ are given exchangeable, distinct random edge-weights.
Proof. We prove by induction on the surplus of $G$ the stronger statement that $K^\infty(G, \cdot)$ is the
law of the minimum spanning tree of $G$, when the edges of $\mathrm{conn}(G)$ are given exchangeable,
distinct random edge-weights. For $s(G) = 0$ the result is obvious.

Assume the result holds for every graph of surplus $s$, and let $G$ have $s(G) = s + 1$. Let
$e$ be the edge of $\mathrm{conn}(G)$ with maximal weight, and condition on $e$ and its weight. Then,
note that the weights of the edges in $\mathrm{conn}(G) \setminus \{e\}$ are still in exchangeable random order,
and the same is true of the edges of $\mathrm{conn}(G \setminus e)$. By the induction hypothesis, $K^s(G \setminus e, \cdot)$ is
the law of the minimum spanning tree of $G \setminus e$. But $e$ is not in the minimum spanning tree
of $G$, because by definition we can find a path between its endpoints that uses only edges
having strictly smaller weights. Hence, $K^s(G \setminus e, \cdot)$ is the law of the minimum spanning tree
of $G$. On the other hand, by exchangeability, the edge $e$ of $\mathrm{conn}(G)$ with largest weight is
uniform in $\mathrm{conn}(G)$, so the unconditional law of a random variable with law $K^s(G \setminus e, \cdot)$ is
$K^{s+1}(G, \cdot)$. □
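The deterministic step inside this proof, namely that deleting the heaviest non-disconnecting edge never removes an edge of the minimum spanning tree, is the classical reverse-delete algorithm. The sketch below checks it against Kruskal's algorithm on a small complete graph with i.i.d. uniform weights. It is an illustration rather than the authors' code: the representation of edges as `(weight, u, v)` triples, the function names, and the choice of a complete graph on 7 vertices are all assumptions.

```python
import random

def connected(n, edges):
    """True if the graph on vertices 0..n-1 with (w, u, v) edges is connected."""
    adj = [[] for _ in range(n)]
    for _, u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for x in adj[u]:
            if x not in seen:
                seen.add(x)
                stack.append(x)
    return len(seen) == n

def reverse_delete(n, edges):
    """Scan edges by decreasing weight; delete each one that is not a bridge."""
    kept = sorted(edges, reverse=True)
    for e in list(kept):
        trial = [f for f in kept if f is not e]
        if connected(n, trial):
            kept = trial
    return set(kept)

def kruskal(n, edges):
    """Standard Kruskal with a small union-find, used as a reference."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    tree = set()
    for e in sorted(edges):
        ru, rv = find(e[1]), find(e[2])
        if ru != rv:
            parent[ru] = rv
            tree.add(e)
    return tree

def random_weighted_complete_graph(n, seed=0):
    """Complete graph on n vertices with i.i.d. uniform (hence distinct) weights."""
    rng = random.Random(seed)
    return [(rng.random(), u, v) for u in range(n) for v in range(u + 1, n)]
```

With distinct weights the minimum spanning tree is unique, so the two procedures should return the same edge set on every input.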



3.2 Cutting the cycles of an $\mathbb{R}$-graph

There is a continuum analogue of the cycle-breaking algorithm in the context of $\mathbb{R}$-graphs,
which we now explain. Recall that $\mathrm{conn}(X)$ is the set of points $x$ of the $\mathbb{R}$-graph $\mathrm{X} = (X, d)$
such that $x \in \mathrm{core}(X)$ and $X \setminus \{x\}$ is connected. For $x \in \mathrm{conn}(X)$, we let $(X_x, d_x)$ be the
space $X$ "cut at $x$". Formally, it is the metric completion of $(X \setminus \{x\}, d_{X \setminus \{x\}})$, where
$d_{X \setminus \{x\}}$ is the intrinsic distance: $d_{X \setminus \{x\}}(y, z)$ is the minimal length of a path from $y$ to $z$
that does not visit $x$.

Definition 3.2. A point $x \in X$ in a measured $\mathbb{R}$-graph $\mathrm{X} = (X,d,\mu)$ is a regular point if
$x \in \mathrm{conn}(X)$, and moreover $\mu(\{x\}) = 0$ and $\deg_X(x) = 2$. A marked space $(X,d,x,\mu) \in \mathcal{M}^{1,1}$,
where $(X,d)$ is an $\mathbb{R}$-graph and $x$ is a regular point, is called safely pointed. We say
that a pointed $\mathbb{R}$-graph $(X, d, x)$ is safely pointed if $(X, d, x, 0)$ is safely pointed.

If $x$ is a regular point then $\mu$ induces a measure (still denoted by $\mu$) on the space $X_x$
with the same total mass. We will give a precise description of the space $\mathrm{X}_x = (X_x, d_x, \mu)$
in Section 7.1: in particular, it is a measured $\mathbb{R}$-graph with $s(\mathrm{X}_x) = s(\mathrm{X}) - 1$.

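Note that if $s(\mathrm{X}) > 0$ and if

\[ L = \ell(\,\cdot \cap \mathrm{conn}(X)) \]

is the length measure restricted to $\mathrm{conn}(X)$, then $L$-almost every point is regular. Also, $L$ is
a finite measure by Theorem 2.7. Therefore, it makes sense to let $\mathcal{K}(\mathrm{X}, \cdot)$ be the law of $\mathrm{X}_x$,
where $x$ is a random point of $X$ with law $L/L(\mathrm{conn}(X))$. If $s(\mathrm{X}) = 0$ we let $\mathcal{K}(\mathrm{X}, \cdot) = \delta_{\{\mathrm{X}\}}$.
Again, $\mathcal{K}$ is a Markov kernel from the set of measured $\mathbb{R}$-graphs with surplus $s$ to the set of
measured $\mathbb{R}$-graphs of surplus $(s - 1) \vee 0$, and $\mathcal{K}^n(\mathrm{X}, \cdot) = \mathcal{K}^{s(\mathrm{X})}(\mathrm{X}, \cdot)$ for every
$n \ge s(\mathrm{X})$: we denote this by $\mathcal{K}^\infty(\mathrm{X}, \cdot)$.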

In Section 7 we will give details of the proofs of the aforementioned properties, as well as
of the following crucial result. For $r \in (0, 1)$ we let $\mathcal{A}_r$ be the set of measured $\mathbb{R}$-graphs with
$s(\mathrm{X}) \le 1/r$ and whose core, seen as a graph with edge-lengths $(k(X), e(X), (\ell(e), e \in e(X)))$,
is such that

\[ \min_{e \in e(X)} \ell(e) \ge r, \quad \text{and} \quad \sum_{e \in e(X)} \ell(e) \le 1/r \]

(if $s(\mathrm{X}) = 1$, this should be understood as the fact that $\mathrm{core}(X)$ is a cycle with length in
$[r, 1/r]$).
Theorem 3.3. Fix $r \in (0, 1)$. Let $(X^n, d^n, \mu^n)$ be a sequence of measured $\mathbb{R}$-graphs in $\mathcal{A}_r$,
converging as $n \to \infty$ to $(X,d,\mu) \in \mathcal{A}_r$ in $(\mathcal{M}, d_{GHP})$. Then $\mathcal{K}^\infty(\mathrm{X}^n, \cdot)$ converges weakly
to $\mathcal{K}^\infty(\mathrm{X}, \cdot)$.

3.3 A relation between the discrete and continuum procedures 

We can view any finite connected multigraph $G = (V,E)$ as a metric space $(V,d)$, where
$d(u,v)$ is the least number of edges in any chain from $u$ to $v$. We may also consider the
metric graph $(m(G), d_{m(G)})$ associated with $G$ by treating edges as segments of length 1
(this is sometimes known as the cable system for the graph $G$ [73]). Then $(m(G), d_{m(G)})$ is
an $\mathbb{R}$-graph. Note that $d_{GH}((V, d), (m(G), d_{m(G)})) < 1$ and, in fact, $(m(G), d_{m(G)})$ contains
an isometric copy of $(V, d)$. Also, temporarily writing $H$ for the graph-theoretic core of $G$,
that is, the maximal subgraph of $G$ of minimum degree two, it is straightforwardly checked
that $\mathrm{core}(m(G))$ is isometric to $(m(H), d_{m(H)})$.

Conversely, let $(X, d)$ be an $\mathbb{R}$-graph, and let $S_X$ be the set of points in $X$ with degree
at least three. We say that $(X, d)$ has integer lengths if all local geodesics between points in
$S_X$ have lengths in $\mathbb{Z}_+$. Let

\[ v(X) = \{x \in X : d(x, S_X) \in \mathbb{Z}_+\}, \]

and note that if $(X,d)$ is compact and has integer lengths then necessarily $|S_X| < \infty$ and
$|v(X)| < \infty$. The removal of all points in $v(X)$ separates $X$ into a finite collection of paths,
each of which is either an open path of length one between two points of $v(X)$, or a half-open
path of length strictly less than one between a point of $v(X)$ and a leaf. Create an edge
between the endpoints of each such open path, and call the collection of such edges $e(X)$.
Then let

\[ g(X) = (v(X), e(X)); \]

we call the multigraph $g(X)$ the graph corresponding to $X$ (see Figure 3).




Figure 3: Left: an $\mathbb{R}$-graph with integer lengths. The points of degree at least three are
marked green, and the remaining points of $v(X)$ are marked red. Centre: the collection of
paths after the points of $v(X)$ are removed. The paths with non-integer length are drawn in
red. Right: the graph $g(X)$.

Now, fix an $\mathbb{R}$-graph $\mathrm{X} = (X, d)$ which has integer lengths and surplus $s(\mathrm{X})$. Let
$x_1, \ldots, x_{s(\mathrm{X})}$ be the points sampled by the successive applications of $\mathcal{K}$ to $\mathrm{X}$: given
$x_1, \ldots, x_i$, the point $x_{i+1}$ is chosen according to $L/L(X)$ on $\mathrm{conn}(X_{x_1,\ldots,x_i})$, where
$X_{x_1,\ldots,x_i}$ is the space $X$ cut successively at $x_1, x_2, \ldots, x_i$. Note that $x_i$ can also
naturally be seen as a point of $X$ for $1 \le i \le s(\mathrm{X})$.

Since the length measure of $v(X)$ is 0, almost surely $x_i \notin v(X)$ for all $1 \le i \le s(\mathrm{X})$.
Thus, each point $x_i$, $1 \le i \le s(\mathrm{X})$, falls in a path component of $\mathrm{core}(X) \setminus v(X)$ which itself
corresponds uniquely to an edge $e_i \in e(X)$. Note that the edges $e_i$, $1 \le i \le s(\mathrm{X})$, are
distinct by construction. Then let $g_0(X) = g(X)$, and for $1 \le i \le s(\mathrm{X})$, write

\[ g_i(X) = (v(X), e(X) \setminus \{e_1, \ldots, e_i\}). \]

By construction, the graph $g_i(X)$ is connected and has surplus precisely $s(\mathrm{X}) - i$, and in
particular $g_{s(\mathrm{X})}(X)$ is a spanning tree of $g(X)$. Let $\mathrm{cut}(\mathrm{X})$ be the random $\mathbb{R}$-graph resulting
from the application of $\mathcal{K}^\infty$, that is, obtained by cutting $X$ at the points $x_1, \ldots, x_{s(\mathrm{X})}$ in our
setting.

Proposition 3.4. We have $d_{GH}(\mathrm{cut}(\mathrm{X}), g_{s(\mathrm{X})}(X)) < 1$.

Proof. First, notice that $g_{s(\mathrm{X})}(X)$ and $g(\mathrm{cut}(\mathrm{X}))$ are isomorphic as graphs, so isometric as
metric spaces. Also, as noted in greater generality at the start of the subsection, we
automatically have $d_{GH}(\mathrm{cut}(\mathrm{X}), g(\mathrm{cut}(\mathrm{X}))) < 1$. □

Proposition 3.5. The graph $g(\mathrm{cut}(\mathrm{X}))$ is identical in distribution to the minimum-weight
spanning tree of $g(X)$ when the edges $e \in e(X)$ are given exchangeable, distinct random
edge weights.

Proof. When performing the discrete cycle-breaking on $g(X)$, the set of edges removed from
$g(X)$ is identical in distribution to the set $\{e_1, \ldots, e_{s(\mathrm{X})}\}$ of edges that are removed from
$g(X)$ to create $g_{s(\mathrm{X})}(X)$, so $g_{s(\mathrm{X})}(X)$ has the same distribution as the minimum spanning
tree by Proposition 3.1. Furthermore, as noted in the proof of the preceding proposition,
$g_{s(\mathrm{X})}(X)$ and $g(\mathrm{cut}(\mathrm{X}))$ are isomorphic. □

3.4 Gluing points in $\mathbb{R}$-graphs

We end this section by mentioning the operation of gluing, which in a vague sense is dual
to the cutting operation. If $(X, d, \mu)$ is an $\mathbb{R}$-graph and $x, y$ are two distinct points of $X$,
we let $\mathrm{X}^{x,y}$ be the quotient metric space [22] of $(X, d)$ by the smallest equivalence relation
for which $x$ and $y$ are equivalent. This space is endowed with the push-forward of $\mu$ by the
canonical projection $p$. It is not difficult to see that $\mathrm{X}^{x,y}$ is again an $\mathbb{R}$-graph, and that the
class of the point $z = p(x) = p(y)$ has degree $\deg_{X^{x,y}}(z) = \deg_X(x) + \deg_X(y)$. Similarly, if
$\mathcal{R}$ is a finite set of unordered pairs $\{x_i, y_i\}$ with $x_i \ne y_i$ in $X$, then one can identify $x_i$ and
$y_i$ for each $i$, resulting in an $\mathbb{R}$-graph $\mathrm{X}^{\mathcal{R}}$.
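To make the quotient metric concrete, here is a minimal sketch for a finite metric space in which a single pair $\{x, y\}$ is identified. It relies on the standard fact that, when only one pair is glued, a shortest chain uses the identification at most once, so the quotient distance is the minimum of three candidates. The finite setting and the name `glue` are assumptions made for illustration; an $\mathbb{R}$-graph would be handled by the same formula applied to all of its points.

```python
def glue(d, x, y):
    """Quotient (semi-)metric after identifying x and y.

    d is a dict of dicts with d[u][v] the original distance. In the result,
    x and y are at distance 0 and should be read as a single point.
    """
    pts = list(d)
    return {
        u: {
            v: min(
                d[u][v],              # do not use the identification
                d[u][x] + d[y][v],    # travel u -> x, jump to y, then y -> v
                d[u][y] + d[x][v],    # travel u -> y, jump to x, then x -> v
            )
            for v in pts
        }
        for u in pts
    }
```

For the path on points 0, 1, 2, 3 with $d(i,j) = |i - j|$, gluing 0 and 3 produces a cycle of length 3, so for example the distance between 1 and 3 drops from 2 to 1.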

4 Convergence of the MST 

We are now ready to state and prove the main results of this paper. We begin by recalling
from the introduction that we write $\mathbb{M}^n$ for the MST of the complete graph on $n$ vertices
and $\mathrm{M}^n$ for the measured metric space obtained from $\mathbb{M}^n$ by rescaling the graph distance
by $n^{-1/3}$ and assigning mass $1/n$ to each vertex.
4.1 The scaling limit of the Erdős-Rényi random graph

Recall that $G(n,p)$ is the Erdős-Rényi random graph. For $\lambda \in \mathbb{R}$, we write

\[ \mathbb{G}^n_\lambda = (\mathbb{G}^{n,i}_\lambda, i \ge 1) \]

for the components of $G(n, 1/n + \lambda/n^{4/3})$ listed in decreasing order of size (among components
of equal size, list components in increasing order of smallest vertex label, say). For each $i \ge 1$,
we then write $\mathrm{G}^{n,i}_\lambda$ for the measured metric space obtained from $\mathbb{G}^{n,i}_\lambda$ by rescaling the graph
distance by $n^{-1/3}$ and giving each vertex mass $n^{-2/3}$, and let

\[ \mathrm{G}^n_\lambda = (\mathrm{G}^{n,i}_\lambda, i \ge 1). \]

In a moment, we will state a scaling limit result for $\mathrm{G}^n_\lambda$; before we can do so, however,
we must introduce the limit sequence of measured metric spaces $\mathscr{G}_\lambda = (\mathscr{G}^i_\lambda, i \ge 1)$. We
will do this somewhat briefly, and refer the interested reader to [3, 4] for more details and
distributional properties.

First, consider the stochastic process $(W^\lambda(t), t \ge 0)$ defined by

\[ W^\lambda(t) := W(t) + \lambda t - \frac{t^2}{2}, \]

where $(W(t), t \ge 0)$ is a standard Brownian motion. Consider the excursions of $W^\lambda$ above
its running minimum; in other words, the excursions of

\[ B^\lambda(t) := W^\lambda(t) - \min_{0 \le s \le t} W^\lambda(s) \]

above 0. We list these in decreasing order of length as $(e^1, e^2, \ldots)$ where, for $i \ge 1$, $\sigma^i$ is the
length of $e^i$. (We suppress the $\lambda$-dependence in the notation for readability.) For definiteness,
we shift the origin of each excursion to 0, so that $e^i : [0, \sigma^i] \to \mathbb{R}_+$ is a continuous function
such that $e^i(0) = e^i(\sigma^i) = 0$ and $e^i(x) > 0$ otherwise.
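A discretised version of this construction can help build intuition for how the parabolic drift eventually suppresses long excursions. The sketch below simulates the Brownian motion with parabolic drift on a grid and reflects it at its running minimum; the Euler-type discretisation, the grid size, and both function names are assumptions made for illustration, and the discrete excursion lengths only approximate the $\sigma^i$.

```python
import random

def reflected_parabolic_bm(lam, T, n, seed=0):
    """Grid approximation of W(t) + lam*t - t^2/2 reflected at its running minimum."""
    rng = random.Random(seed)
    dt = T / n
    w, running_min = 0.0, 0.0
    B = [0.0]
    for k in range(1, n + 1):
        t0, t1 = (k - 1) * dt, k * dt
        # Exact increment of the drift lam*t - t^2/2 over [t0, t1].
        drift = lam * dt - (t1 * t1 - t0 * t0) / 2.0
        w += rng.gauss(0.0, dt ** 0.5) + drift
        running_min = min(running_min, w)
        B.append(w - running_min)
    return B

def excursion_lengths(B, dt):
    """Lengths of the completed excursions of B above 0, in decreasing order."""
    zeros = [k for k, b in enumerate(B) if b == 0.0]
    lengths = [(j - i) * dt for i, j in zip(zeros, zeros[1:]) if j - i > 1]
    return sorted(lengths, reverse=True)
```

A call such as `excursion_lengths(reflected_parabolic_bm(0.5, 5.0, 5000), 5.0 / 5000)` returns the discrete analogue of the ranked excursion lengths on the time window $[0, 5]$; the excursion still in progress at time $T$ is not counted.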

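Now for $i \ge 1$ and for $x, x' \in [0, \sigma^i]$, define a pseudo-distance via

\[ \tilde{d}^i(x, x') = 2e^i(x) + 2e^i(x') - 4 \inf_{x \wedge x' \le t \le x \vee x'} e^i(t). \]

Declare that $x \sim x'$ if $\tilde{d}^i(x, x') = 0$, so that $\sim$ is an equivalence relation on $[0,\sigma^i]$. Now let
$\tilde{\mathcal{T}}^i = [0,\sigma^i]/\!\sim$ and denote by $\tau^i : [0,\sigma^i] \to \tilde{\mathcal{T}}^i$ the canonical projection. Then $\tilde{d}^i$
induces a distance on $\tilde{\mathcal{T}}^i$, still denoted by $\tilde{d}^i$, and it is standard (see, for example, [44])
that $(\tilde{\mathcal{T}}^i, \tilde{d}^i)$ is a compact $\mathbb{R}$-tree. Write $\tilde{\mu}^i$ for the push-forward of Lebesgue measure on
$[0,\sigma^i]$ by $\tau^i$, so that $(\tilde{\mathcal{T}}^i, \tilde{d}^i, \tilde{\mu}^i)$ is a measured $\mathbb{R}$-tree of total mass $\sigma^i$.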
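The pseudo-distance above is straightforward to evaluate when the excursion is sampled on a grid. The following sketch does this for a list of heights; the discretisation (the infimum is replaced by a minimum over grid points) and the name `tree_distance` are assumptions made for illustration, while the normalisation with factors 2 and 4 follows the formula in the text.

```python
def tree_distance(e, i, j):
    """Pseudo-distance 2 e(i) + 2 e(j) - 4 min_{i^j <= t <= i v j} e(t) on a grid."""
    lo, hi = min(i, j), max(i, j)
    return 2 * e[i] + 2 * e[j] - 4 * min(e[lo:hi + 1])
```

Two indices at pseudo-distance 0 are identified in the quotient: for the heights `[0, 1, 2, 1, 2, 1, 0]`, indices 1 and 5 encode the same point of the tree, whereas the two peaks at indices 2 and 4 are distinct leaves.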

We now decorate the process $B^\lambda$ with the points of an independent homogeneous Poisson
process in the plane. We can think of the points which fall under the different excursions
separately. In particular, to the excursion $e^i$, we associate a finite collection
$\mathcal{P}^i = \{(x^{i,j}, y^{i,j}), 1 \le j \le s^i\}$ of points of $[0,\sigma^i] \times [0, \infty)$ which are the Poisson points
shifted in the same way as the excursion $e^i$. (For definiteness, we list the points of $\mathcal{P}^i$ in
increasing order of first co-ordinate.) Conditional on $e^1, e^2, \ldots$, the collections $\mathcal{P}^1, \mathcal{P}^2, \ldots$
of points are independent. Moreover, by construction, given the excursion $e^i$, we have
$s^i \sim \mathrm{Poisson}\bigl(\int_0^{\sigma^i} e^i(t)\,dt\bigr)$. Let $z^{i,j} = \inf\{t \ge x^{i,j} : e^i(t) = y^{i,j}\}$ and note that, by the
continuity of $e^i$, $z^{i,j} < \sigma^i$ almost surely. Let

\[ \mathcal{R}^i = \bigl\{ \{\tau^i(x^{i,j}), \tau^i(z^{i,j})\},\ 1 \le j \le s^i \bigr\}. \]
Then $\mathcal{R}^i$ is a collection of unordered pairs of points in the $\mathbb{R}$-tree $\tilde{\mathcal{T}}^i$. We wish to glue
these points together in order to obtain an $\mathbb{R}$-graph, as in Section 3.4. We define a new
equivalence relation $\sim$ by declaring $x \sim x'$ in $\tilde{\mathcal{T}}^i$ if $\{x, x'\} \in \mathcal{R}^i$. Then let $X^i$ be $\tilde{\mathcal{T}}^i/\!\sim$,
let $d^i$ be the quotient metric [22], and let $\mu^i$ be the push-forward of $\tilde{\mu}^i$ to $X^i$. Then set
$\mathscr{G}^i_\lambda = (X^i, d^i, \mu^i)$ and $\mathscr{G}_\lambda = (\mathscr{G}^i_\lambda, i \ge 1)$. We note that for each $i \ge 1$, the measure $\tilde{\mu}^i$ is
almost surely concentrated on the leaves of $\tilde{\mathcal{T}}^i$. As a consequence, $\mu^i$ is almost surely
concentrated on the leaves of $X^i$.

Given an $\mathbb{R}$-graph $\mathrm{X}$, write $r(\mathrm{X})$ for the minimal length of a core edge in $\mathrm{X}$. Then
r(X) = mi{d{u, &(X)} whenever ker(X) is non-empty. We use the convention that 

$r(\mathrm{X}) = \infty$ if $\mathrm{core}(X) = \emptyset$ and $r(\mathrm{X}) = \ell(c)$ if $X$ has a unique embedded cycle $c$. Recall also
that $s(\mathrm{X})$ denotes the surplus of $\mathrm{X}$.

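Theorem 4.1. Fix $\lambda \in \mathbb{R}$. Then as $n \to \infty$, we have the following joint convergence:

\[ \mathrm{G}^n_\lambda \xrightarrow{d} \mathscr{G}_\lambda, \]
\[ (s(\mathrm{G}^{n,i}_\lambda), i \ge 1) \xrightarrow{d} (s(\mathscr{G}^i_\lambda), i \ge 1), \quad \text{and} \]
\[ (r(\mathrm{G}^{n,i}_\lambda), i \ge 1) \xrightarrow{d} (r(\mathscr{G}^i_\lambda), i \ge 1). \]

The first convergence takes place in the space $(L_4, \mathrm{dist}^4_{GHP})$. The others are in the sense of
finite-dimensional distributions.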

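Let $\ell^2_\downarrow = \{x = (x_1, x_2, \ldots) : x_1 \ge x_2 \ge \cdots \ge 0,\ \sum_{i=1}^{\infty} x_i^2 < \infty\}$. Corollary 2 of [9] gives
the following joint convergence

\[ (\mathrm{mass}(\mathrm{G}^{n,i}_\lambda), i \ge 1) \xrightarrow{d} (\mathrm{mass}(\mathscr{G}^i_\lambda), i \ge 1), \quad \text{and} \quad (s(\mathrm{G}^{n,i}_\lambda), i \ge 1) \xrightarrow{d} (s(\mathscr{G}^i_\lambda), i \ge 1), \qquad (9) \]

where the first convergence is in $(\ell^2_\downarrow, \|\cdot\|_2)$ and the second is in the sense of finite-dimensional
distributions. (Of course, $\mathrm{mass}(\mathscr{G}^i_\lambda) = \sigma^i$ and $s(\mathscr{G}^i_\lambda) = s^i$.) Theorem 1 of [4] extends this to
give that, jointly,

\[ (\mathrm{G}^{n,i}_\lambda, i \ge 1) \xrightarrow{d} (\mathscr{G}^i_\lambda, i \ge 1) \qquad (10) \]

in the sense of $\mathrm{dist}^4_{GH}$, where for $\mathbf{X}, \mathbf{Y} \in \mathring{\mathcal{M}}^{\mathbb{N}}$,
$\mathrm{dist}^4_{GH}(\mathbf{X},\mathbf{Y}) = \bigl(\sum_{i \ge 1} d_{GH}(\mathrm{X}_i, \mathrm{Y}_i)^4\bigr)^{1/4}$. We
need to improve this convergence from $\mathrm{dist}^4_{GH}$ to $\mathrm{dist}^4_{GHP}$. First we show that we can get
GHP convergence componentwise. We do this in two lemmas.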

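Lemma 4.2. Suppose that $(\mathcal{T}, d, \mu)$ and $(\mathcal{T}', d', \mu')$ are measured $\mathbb{R}$-trees, that
$\{(x_i, y_i), 1 \le i \le k\}$ are pairs of points in $\mathcal{T}$ and that $\{(x'_i, y'_i), 1 \le i \le k\}$ are pairs of points
in $\mathcal{T}'$. Then if $(\hat{\mathcal{T}}, \hat{d}, \hat{\mu})$ and $(\hat{\mathcal{T}}', \hat{d}', \hat{\mu}')$ are the measured metric spaces obtained by
identifying $x_i$ and $y_i$ in $\mathcal{T}$ and $x'_i$ and $y'_i$ in $\mathcal{T}'$, for all $1 \le i \le k$, we have

\[ d_{GHP}\bigl((\hat{\mathcal{T}}, \hat{d}, \hat{\mu}), (\hat{\mathcal{T}}', \hat{d}', \hat{\mu}')\bigr) \le (k + 1)\, d^{2k,1}_{GHP}\bigl((\mathcal{T}, d, \mathbf{x}, \mu), (\mathcal{T}', d', \mathbf{x}', \mu')\bigr), \]

where $\mathbf{x} = (x_1, \ldots, x_k, y_1, \ldots, y_k)$ and similarly for $\mathbf{x}'$.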

Proof. Let $C$ and $\pi$ be a correspondence and a measure which realise the Gromov-Hausdorff-Prokhorov
distance between $(\mathcal{T}, d, \mathbf{x}, \mu)$ and $(\mathcal{T}', d', \mathbf{x}', \mu')$; write $\delta$ for this distance. Note
that by definition, $(x_i, x'_i) \in C$ and $(y_i, y'_i) \in C$ for $1 \le i \le k$. Now make the vertex
identifications in order to obtain $\hat{\mathcal{T}}$ and $\hat{\mathcal{T}}'$; let $p : \mathcal{T} \to \hat{\mathcal{T}}$ and $p' : \mathcal{T}' \to \hat{\mathcal{T}}'$ be the
corresponding canonical projections. Then

\[ \hat{C} = \{(p(x), p'(x')) : (x, x') \in C\} \]

is a correspondence between $\hat{\mathcal{T}}$ and $\hat{\mathcal{T}}'$. Let $\hat{\pi}$ be the push-forward of the measure $\pi$ by
$(p, p')$. Then $D(\hat{\pi}; \hat{\mu}, \hat{\mu}') \le \delta$ and $\hat{\pi}(\hat{C}^c) \le \delta$. Moreover, by Lemma 21 of [4], we have
$\mathrm{dis}(\hat{C}) \le (k + 1)\delta$. The claimed result follows. □

Lemma 4.3. Fix $i \ge 1$. Then as $n \to \infty$,

\[ \mathrm{G}^{n,i}_\lambda \xrightarrow{d} \mathscr{G}^i_\lambda \]

in $(\mathcal{M}, d_{GHP})$.

Proof. This proof is a fairly straightforward modification of the proof of Theorem 22 in [4],
so we will only sketch the argument. Consider the component $\mathbb{G}^{n,i}_\lambda$. Since we have fixed
$\lambda$ and $i$, let us drop them from the notation and simply write $\mathbb{G}^n$ for the component, and
similarly for other objects. Write $n^{2/3}\Sigma^n \in \mathbb{Z}_+$ for the size of $\mathbb{G}^n$ and $S^n \in \mathbb{Z}_+$ for its surplus.
We can list the vertices of this graph in depth-first order as $v_0, v_1, \ldots, v_{n^{2/3}\Sigma^n - 1}$. Let $H^n(k)$
be the graph distance of vertex $v_k$ from $v_0$. Then it is easy to see that $n^{-1/3}H^n$ encodes a
tree $T^n$ on vertices $v_0, v_1, \ldots, v_{n^{2/3}\Sigma^n - 1}$ with metric $d_{T^n}$ such that $d_{T^n}(v_k, v_0) = n^{-1/3}H^n(k)$. We
endow $T^n$ with a measure by letting each vertex of $T^n$ have mass $n^{-2/3}$.
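The depth-first encoding used in this step can be sketched as follows for a connected graph given by adjacency lists. This is an illustration under stated assumptions, not the exact exploration of [4]: here neighbours are visited in increasing label order, heights are depths in the resulting depth-first tree, and the name `depth_first_heights` is hypothetical. The edges of the graph that are not tree edges are returned as the surplus edges.

```python
def depth_first_heights(adj, root):
    """Depth-first order, heights in the depth-first tree, and surplus edges.

    adj maps each vertex to an iterable of its neighbours (simple graph assumed).
    """
    order, height, tree_edges = [], {}, set()
    seen = set()
    stack = [(root, None, 0)]
    while stack:
        v, parent, h = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        order.append(v)
        height[v] = h
        if parent is not None:
            tree_edges.add(frozenset((parent, v)))
        # Push in reverse so that the smallest label is explored first.
        for w in sorted(adj[v], reverse=True):
            if w not in seen:
                stack.append((w, v, h + 1))
    all_edges = {frozenset((u, w)) for u in adj for w in adj[u]}
    surplus_edges = all_edges - tree_edges
    return order, [height[v] for v in order], surplus_edges
```

On a 4-cycle explored from vertex 0, the depth-first tree is a path with heights 0, 1, 2, 3 and the single surplus edge joins the first and last explored vertices.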

Next, let the pairs $\{i_1, j_1\}, \{i_2, j_2\}, \ldots, \{i_{S^n}, j_{S^n}\}$ give the indices of the surplus edges
required to obtain $\mathbb{G}^n$ from $T^n$, listed in increasing order of first co-ordinate. In other words,
to build $\mathbb{G}^n$ from $T^n$, we add an edge between $v_{i_k}$ and $v_{j_k}$ for each $1 \le k \le S^n$ (and
re-multiply distances by $n^{1/3}$). Recall that to get $\mathrm{G}^n$ from $\mathbb{G}^n$, we rescale the graph distance by
$n^{-1/3}$ and assign mass $n^{-2/3}$ to each vertex. It is straightforward that $\mathrm{G}^n$ is at GHP distance
at most $n^{-1/3}S^n$ from the metric space $\hat{\mathrm{G}}^n$ obtained from $T^n$ by identifying vertices $v_{i_k}$ and
$v_{j_k}$ for all $1 \le k \le S^n$.

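From the proof of Theorem 22 of [4], we have jointly

\[ (\Sigma^n, S^n) \xrightarrow{d} (\sigma, s), \]
\[ \bigl(n^{-1/3}H^n(\lfloor n^{2/3}t \rfloor),\ 0 \le t < \Sigma^n\bigr) \xrightarrow{d} \bigl(2e(t),\ 0 \le t < \sigma\bigr), \]
\[ \bigl\{\{n^{-2/3}i_k, n^{-2/3}j_k\},\ 1 \le k \le S^n\bigr\} \xrightarrow{d} \bigl\{\{x_k, z_k\},\ 1 \le k \le s\bigr\}. \]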

By Skorokhod's representation theorem, we may work on a probability space where these
convergences hold almost surely. Consider the $\mathbb{R}$-tree $(\mathcal{T}, d_{\mathcal{T}})$ encoded by $2e$ and recall
that $\tau$ is the canonical projection $[0, \sigma] \to \mathcal{T}$. We extend $\tau$ to a function on $[0, \infty)$ by
letting $\tau(t) = \tau(t \wedge \sigma)$. Let $\eta^n : [0, \infty) \to \{v_0, v_1, \ldots, v_{n^{2/3}\Sigma^n - 1}\}$ be the function defined by
$\eta^n(t) = v_{\lfloor n^{2/3}t \rfloor \wedge (n^{2/3}\Sigma^n - 1)}$. Set

\[ C^n = \{(\eta^n(t), \tau(t')) : t, t' \in [0, \Sigma^n \vee \sigma],\ |t - t'| \le \delta_n\}, \]

where $\delta_n$ converges to 0 slowly enough, that is,

\[ \delta_n \ge \max_{1 \le k \le s} |n^{-2/3}i_k - x_k| \vee |n^{-2/3}j_k - z_k|. \]
l<k<s 
Then $C^n$ is a correspondence between $T^n$ and $\mathcal{T}$ that contains $(v_{i_k}, x_k)$ and $(v_{j_k}, z_k)$ for
every $k \in \{1, 2, \ldots, s\}$, and with distortion going to 0. Next, let $\pi^n$ be the push-forward of
Lebesgue measure on $[0, \Sigma^n \wedge \sigma]$ under the mapping $(\eta^n, \tau)$. Then the discrepancy of $\pi^n$ with
respect to the uniform measure $\mu^n$ on $T^n$ and the image $\mu$ of Lebesgue measure by $\tau$ on $\mathcal{T}$
is $|\Sigma^n - \sigma|$, and $\pi^n((C^n)^c) = 0$.

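For all large enough $n$, $S^n = s$, so let us assume henceforth that this holds. Then, writing
$\mathbf{v} = (v_{i_1}, \ldots, v_{i_s}, v_{j_1}, \ldots, v_{j_s})$ and $\mathbf{x} = (x_1, \ldots, x_s, z_1, \ldots, z_s)$, we have

\[ d^{2s,1}_{GHP}\bigl((T^n, \mathbf{v}, \mu^n), (\mathcal{T}, \mathbf{x}, \mu)\bigr) \le \Bigl(\tfrac{1}{2}\,\mathrm{dis}(C^n)\Bigr) \vee |\Sigma^n - \sigma| \to 0 \]

almost surely, as $n \to \infty$. By Lemma 4.2 we thus have $d_{GHP}(\hat{\mathrm{G}}^n, \mathscr{G}) \to 0$ almost surely, as
$n \to \infty$. Since $d_{GHP}(\mathrm{G}^n, \hat{\mathrm{G}}^n) \le n^{-1/3}S^n \to 0$, it follows that $d_{GHP}(\mathrm{G}^n, \mathscr{G}) \to 0$ as well. □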

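Proof of Theorem 4.1. By (9), (10), Lemma 4.3 and Skorokhod's representation theorem, we
may work in a probability space in which the convergences in (9) and in (10) occur almost
surely, and in which for every $i \ge 1$ we almost surely have

\[ d_{GHP}(\mathrm{G}^{n,i}_\lambda, \mathscr{G}^i_\lambda) \to 0 \qquad (11) \]

as $n \to \infty$. Now, for each $i \ge 1$,

\[ d_{GHP}(\mathrm{G}^{n,i}_\lambda, \mathscr{G}^i_\lambda) \le 2\max\{\mathrm{diam}(\mathrm{G}^{n,i}_\lambda), \mathrm{diam}(\mathscr{G}^i_\lambda), \mathrm{mass}(\mathrm{G}^{n,i}_\lambda), \mathrm{mass}(\mathscr{G}^i_\lambda)\}. \]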
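The proof of Theorem 24 from [4] shows that almost surely

$$\lim_{N \to \infty} \sum_{i=N}^{\infty} \mathrm{diam}(\mathscr{G}^i_\lambda)^4 = 0,$$

and (10) then implies that almost surely

$$\lim_{N \to \infty} \limsup_{n \to \infty} \sum_{i=N}^{\infty} \mathrm{diam}(G^{n,i}_\lambda)^4 = 0.$$

The $\ell^2$ convergence of the masses entails that almost surely

$$\lim_{N \to \infty} \sum_{i=N}^{\infty} \mathrm{mass}(\mathscr{G}^i_\lambda)^4 = 0,$$

and (9) then implies that almost surely

$$\lim_{N \to \infty} \limsup_{n \to \infty} \sum_{i=N}^{\infty} \mathrm{mass}(G^{n,i}_\lambda)^4 = 0.$$

Hence, on this probability space, we have

$$\lim_{N \to \infty} \limsup_{n \to \infty} \sum_{i=N}^{\infty} d_{GHP}(G^{n,i}_\lambda, \mathscr{G}^i_\lambda)^4 \le 16 \lim_{N \to \infty} \limsup_{n \to \infty} \sum_{i=N}^{\infty} \left(\mathrm{diam}(G^{n,i}_\lambda)^4 + \mathrm{diam}(\mathscr{G}^i_\lambda)^4 + \mathrm{mass}(G^{n,i}_\lambda)^4 + \mathrm{mass}(\mathscr{G}^i_\lambda)^4\right) = 0$$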
almost surely. Combined with (11), this implies that in this space, almost surely

$$\lim_{n \to \infty} \mathrm{dist}^4_{GHP}(G^n_\lambda, \mathscr{G}_\lambda) = 0.$$

The convergence of $(s(G^{n,i}_\lambda), i \ge 1)$ to $(s(\mathscr{G}^i_\lambda), i \ge 1)$ follows from (9).

If $i$ is such that $s(\mathscr{G}^i_\lambda) = 1$ then, by (9), we almost surely have $s(G^{n,i}_\lambda) = 1$ for all $n$ sufficiently large. In this case, $r(G^{n,i}_\lambda)$ and $r(\mathscr{G}^i_\lambda)$ are the lengths of the unique cycles in $G^{n,i}_\lambda$ and in $\mathscr{G}^i_\lambda$, respectively. Now, $G^{n,i}_\lambda \to \mathscr{G}^i_\lambda$ almost surely for the Gromov-Hausdorff distance, and it follows easily that in this space, $r(G^{n,i}_\lambda) \to r(\mathscr{G}^i_\lambda)$ almost surely, for $i$ such that $s(\mathscr{G}^i_\lambda) = 1$.

Finally, by Theorem 4 of [49], $\min(r(G^{n,i}_\lambda) : s(G^{n,i}_\lambda) \ge 2)$ is bounded away from zero in probability. So by Skorokhod's representation theorem, we may assume our space is such that almost surely

$$\liminf_{n \to \infty} \min(r(G^{n,i}_\lambda) : s(G^{n,i}_\lambda) \ge 2) > 0.$$

In particular, it follows from the above that, for any $i \ge 1$ with $s(\mathscr{G}^i_\lambda) \ge 2$, there is almost surely $r > 0$ such that $\mathscr{G}^i_\lambda \in \mathcal{A}_r$ and $G^{n,i}_\lambda \in \mathcal{A}_r$ for all $n$ sufficiently large. Corollary 6.6 (i) then yields that in this space, $r(G^{n,i}_\lambda) \to r(\mathscr{G}^i_\lambda)$ almost surely.

Together, the two preceding paragraphs establish the final claimed convergence. For completeness, we note that this final convergence may also be deduced without recourse to the results of [49]; here is a brief sketch, using the notation of the previous lemma. It is easily checked that the points of the kernels of $G^{n,i}_\lambda$ and $\mathscr{G}^i_\lambda$ correspond to the identified vertices $(v_{i_k}, v_{j_k})$ and $(x_k, z_k)$, and those vertices of degree at least 3 in the subtrees of $T^n$, $\mathcal{T}$ spanned by the points $(v_{i_k}, v_{j_k}), 1 \le k \le s$ and $(x_k, z_k), 1 \le k \le s$ respectively. These trees are combinatorially finite trees (i.e., they are finite trees with edge-lengths), so the convergence of the marked trees $(T^n, \mathbf{v})$ to $(\mathcal{T}, \mathbf{x})$ entails in fact the convergence of the same trees marked not only by $\mathbf{v}, \mathbf{x}$ but also by the points of degree 3 on their skeletons. Write $\mathbf{v}', \mathbf{x}'$ for these enlarged collections of points. Then one concludes by noting that $r(G^{n,i}_\lambda)$ (resp. $r(\mathscr{G}^i_\lambda)$) is the minimum quotient distance, after the identifications $(v_{i_k}, v_{j_k})$ (resp. $(x_k, z_k)$), between any two distinct elements of $\mathbf{v}'$ (resp. $\mathbf{x}'$). This entails that $r(G^{n,i}_\lambda)$ converges almost surely to $r(\mathscr{G}^i_\lambda)$ for each $i \ge 1$. □

The above description of the sequence $\mathscr{G}_\lambda$ of random $\mathbb{R}$-graphs does not make the distribution of the cores and kernels of the components explicit. (Clearly the kernel of $\mathscr{G}^i_\lambda$ is only non-empty if $s(\mathscr{G}^i_\lambda) \ge 2$ and its core is only non-empty if $s(\mathscr{G}^i_\lambda) \ge 1$.) Such an explicit distributional description was provided in [3], and will be partially detailed below in Section 5.

4.2 Convergence of the minimum spanning forest 

Recall that $\mathbb{M}(n, p)$ is the minimum spanning forest of $\mathbb{G}(n, p)$ and that we write

$$\mathbb{M}^n_\lambda = (\mathbb{M}^{n,i}_\lambda,\ i \ge 1)$$

for the components of $\mathbb{M}(n, 1/n + \lambda/n^{4/3})$ listed in decreasing order of size. For each $i \ge 1$ we write $M^{n,i}_\lambda$ for the measured metric space obtained from $\mathbb{M}^{n,i}_\lambda$ by rescaling the graph distance by $n^{-1/3}$ and giving each vertex mass $n^{-2/3}$. We let

$$M^n_\lambda = (M^{n,i}_\lambda,\ i \ge 1).$$
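As an illustration of the objects just defined, the following sketch builds the minimum spanning forest at level $p = 1/n + \lambda/n^{4/3}$ by running Kruskal's algorithm on i.i.d. uniform edge weights and stopping at weight $p$. It is only a small-scale simulation under stated assumptions: vertices are labelled $0, \dots, n-1$ rather than $1, \dots, n$, and the function names are hypothetical.

```python
import random


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def minimum_spanning_forest(n, weights, p):
    """Kruskal's algorithm stopped at weight p.

    weights maps pairs (u, v) with u < v to edge weights. Returns the forest
    edges and the components (vertex lists) in decreasing order of size.
    """
    parent = list(range(n))
    forest = []
    for (u, v), w in sorted(weights.items(), key=lambda kv: kv[1]):
        if w > p:
            break  # all remaining edges are heavier than the threshold
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv
            forest.append((u, v))
    comps = {}
    for v in range(n):
        comps.setdefault(find(parent, v), []).append(v)
    components = sorted(comps.values(), key=len, reverse=True)
    return forest, components


rng = random.Random(0)
n, lam = 200, 1.0
weights = {(u, v): rng.random() for u in range(n) for v in range(u + 1, n)}
p = 1 / n + lam / n ** (4 / 3)
forest, components = minimum_spanning_forest(n, weights, p)
sizes = [len(c) for c in components]
```

Rescaling graph distances in each component by $n^{-1/3}$ and giving each vertex mass $n^{-2/3}$ would then give a simulated instance of $M^n_\lambda$.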
Recall the cutting procedure introduced in Section 3.2, and that for an $\mathbb{R}$-graph $X$, we write $\mathrm{cut}(X)$ for a random variable with distribution $\mathcal{K}^\infty(X, \cdot)$. For $i \ge 1$, if $s(\mathscr{G}^i_\lambda) = 0$, let $\mathscr{M}^i_\lambda = \mathscr{G}^i_\lambda$. Otherwise, let $\mathscr{M}^i_\lambda = \mathrm{cut}(\mathscr{G}^i_\lambda)$, where the cutting mechanism is run independently for each $i$. We note for later use that the mass measure on $\mathscr{M}^i_\lambda$ is almost surely concentrated on the leaves of $\mathscr{M}^i_\lambda$, since this property holds for the $\mathbb{R}$-tree encoding $\mathscr{G}^i_\lambda$, and $\mathscr{G}^i_\lambda$ may be obtained from this tree by making an almost surely finite number of identifications.

Theorem 4.4. Fix $\lambda \in \mathbb{R}$. Then as $n \to \infty$,

$$M^n_\lambda \xrightarrow{d} \mathscr{M}_\lambda$$

in the space $(\mathbb{L}_4, \mathrm{dist}^4_{GHP})$.
Proof. Write

$$I = \sup\{i \ge 1 : s(\mathscr{G}^i_\lambda) \ge 1\},$$

with the convention that $I = 0$ when $\{i \ge 1 : s(\mathscr{G}^i_\lambda) \ge 1\} = \emptyset$. Likewise, for $n \ge 1$ let $I_n = \sup\{i \ge 1 : s(G^{n,i}_\lambda) \ge 1\}$. We work in a probability space in which the convergence statements of Theorem 4.1 are all almost sure. In this probability space, by Theorem 5.19 of [38] we have that $I$ is almost surely finite and that $I_n \to I$ almost surely.

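By Theorem 4.1, almost surely $r(G^{n,i}_\lambda)$ is bounded away from zero for all $i \ge 1$. It follows from Theorem 3.3 that almost surely for every $i \ge 1$ we have

$$d_{GHP}(\mathrm{cut}(G^{n,i}_\lambda), \mathrm{cut}(\mathscr{G}^i_\lambda)) \to 0.$$

Propositions 3.4 and 3.5 then imply that we may work in a probability space in which almost surely, for every $i \ge 1$,

$$d_{GHP}(M^{n,i}_\lambda, \mathscr{M}^i_\lambda) \to 0. \qquad (12)$$

Now, for each $i \ge 1$, we have

$$d_{GHP}(M^{n,i}_\lambda, \mathscr{M}^i_\lambda) \le 2\max(\mathrm{diam}(M^{n,i}_\lambda), \mathrm{diam}(\mathscr{M}^i_\lambda), \mathrm{mass}(M^{n,i}_\lambda), \mathrm{mass}(\mathscr{M}^i_\lambda)).$$

Moreover, for each $i > I$ the right-hand side is bounded above by

$$4\max(\mathrm{diam}(G^{n,i}_\lambda), \mathrm{diam}(\mathscr{G}^i_\lambda), \mathrm{mass}(G^{n,i}_\lambda), \mathrm{mass}(\mathscr{G}^i_\lambda)).$$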

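Since $I$ is almost surely finite, as in the proof of Theorem 4.1 we thus almost surely have that

$$\lim_{N \to \infty} \limsup_{n \to \infty} \sum_{i=N}^{\infty} d_{GHP}(M^{n,i}_\lambda, \mathscr{M}^i_\lambda)^4 \le 64 \lim_{N \to \infty} \limsup_{n \to \infty} \sum_{i=N}^{\infty} \left(\mathrm{diam}(G^{n,i}_\lambda)^4 + \mathrm{diam}(\mathscr{G}^i_\lambda)^4 + \mathrm{mass}(G^{n,i}_\lambda)^4 + \mathrm{mass}(\mathscr{G}^i_\lambda)^4\right) = 0,$$

which combined with (12) shows that in this space, almost surely

$$\lim_{n \to \infty} \mathrm{dist}^4_{GHP}(M^n_\lambda, \mathscr{M}_\lambda) = 0. \qquad □$$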
4.3 The largest tree in the minimum spanning forest

In this section, we study the largest component $\mathbb{M}^{n,1}_\lambda$ of the minimum spanning forest obtained by partially running Kruskal's algorithm, as well as its analogue $\mathbb{G}^{n,1}_\lambda$ for the random graph. It will be useful to consider the random variable $\Lambda^n$ which is the smallest number $\lambda \in \mathbb{R}$ such that $\mathbb{G}^{n,1}_\lambda$ is a subgraph of $\mathbb{G}^{n,1}_{\lambda'}$ for every $\lambda' > \lambda$. In other words, in the race of components, $\Lambda^n$ is the last instant where a new component takes the lead. It follows from Theorem 7 of [47] that $(\Lambda^n, n \ge 1)$ is tight, that is

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}(\Lambda^n > \lambda) = 0. \qquad (13)$$

(This result is stated in [47] for the other Erdős-Rényi random graph model, $\mathbb{G}(n, m)$, rather than $\mathbb{G}(n, p)$, but it is standard that results for the former have equivalents for the latter; see [38] for more details.)

In the following, if $x \mapsto f(x)$ is a real function, we write $f(x) = \mathrm{oe}(x)$ if there exist positive, finite constants $c, c', \epsilon, A$ such that

$$|f(x)| \le c \exp(-c' x^{\epsilon}), \quad \text{for every } x \ge A.$$

In the following lemma, we write $d_H(M^{n,1}_\lambda, M^n)$ for the Hausdorff distance between $M^{n,1}_\lambda$ and $M^n$, seen as subspaces of $M^n$. Obviously, $d_{GH}(M^{n,1}_\lambda, M^n) \le d_H(M^{n,1}_\lambda, M^n)$.

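Lemma 4.5. For any $\epsilon \in (0, 1)$ and $\lambda_0$ large enough, we have

$$\limsup_{n \to \infty} \mathbb{P}\left(d_H(M^{n,1}_\lambda, M^n) \ge \frac{1}{\lambda^{1-\epsilon}} \,\Big|\, \Lambda^n \le \lambda_0\right) = \mathrm{oe}(\lambda).$$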

In the course of the proof of Lemma 4.5, we will need the following estimate on the 
length of the longest path outside the largest component of a random graph within the 
critical window. 



Lemma 4.6. For all $0 < \epsilon < 1$ there exists $\lambda_0$ such that for all $\lambda \ge \lambda_0$ and all $n$ sufficiently large, the probability that a connected component of $\mathbb{G}^n_\lambda$ aside from $\mathbb{G}^{n,1}_\lambda$ contains a simple path of length at least $n^{1/3}/\lambda^{1-\epsilon}$ is at most $e^{-\lambda^{\epsilon/2}}$.

The proof of Lemma 4.6 follows precisely the same steps as the proof of Lemma 3 (b) of [5], which is essentially the special case $\epsilon = 1/2$ (see footnote 4). Since no new idea is involved, we omit the details.

Proof of Lemma 4.5. Fix $f_0 > 0$ and for $i \ge 0$, let $f_i = (5/4)^i \cdot f_0$. Let $t = t(n)$ be the smallest $i$ for which $f_i \ge n^{1/3}/\log n$. Lemma 4 of [5] (proved via Prim's algorithm) states that

$$\mathbb{E}\left[\mathrm{diam}(\mathbb{M}^n) - \mathrm{diam}(\mathbb{M}^{n,1}_{f_t})\right] = O(n^{1/6}(\log n)^{7/2});$$

this is established by proving the following stronger bound, which will be useful in the sequel:

$$\mathbb{P}\left(d_H(M^{n,1}_{f_t}, M^n) > n^{-1/6}(\log n)^{7/2}\right) \le \frac{1}{n}. \qquad (14)$$



4 In [5] it was sufficient for the purpose of the authors to produce a path length bound of $n^{1/3}/\lambda^{1/2}$, but their proof does imply the present stronger result. For the careful reader, the key point is that the last estimate in Theorem 19 of [5] is a specialisation of a more general bound, Theorem 11 (iii) of [48]. Using the more general bound in the proof is the only modification required to yield the above result.
Let $B_i$ be the event that some component of $\mathbb{G}^n_{f_i}$ aside from $\mathbb{G}^{n,1}_{f_i}$ contains a simple path with more than $n^{1/3}/f_i^{1-\epsilon}$ edges and let

$$I^n = \max\{i \le t : B_i \text{ occurs}\}.$$

Lemma 4.6 entails that, for $f_0$ sufficiently large, for all $n$, and all $0 \le i \le t - 1$,

$$\mathbb{P}(i \le I^n \le t) \le \sum_{l=i}^{t} e^{-f_l^{\epsilon/2}} \le 2 e^{-f_i^{\epsilon/2}},$$

where the last inequality holds for all $i$ sufficiently large. For fixed $i < t$, if $\Lambda^n \le f_i$ then for all $\lambda \in [f_i, f_t]$ we have

$$d_H(M^{n,1}_\lambda, M^{n,1}_{f_t}) \le d_H(M^{n,1}_{f_i}, M^{n,1}_{f_t}).$$

If, moreover, $I^n < i$, then we have

$$d_H(M^{n,1}_{f_i}, M^{n,1}_{f_t}) \le \sum_{l=i}^{t-1} f_l^{\epsilon - 1} \le \frac{f_i^{\epsilon-1}}{1 - (4/5)^{1-\epsilon}} \le 10 f_i^{\epsilon - 1}, \qquad (15)$$

the latter inequality holding for $\epsilon < 1/2$.

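Finally, fix $\lambda \in \mathbb{R}$ and let $i_0 = i_0(\lambda)$ be such that $\lambda \in [f_{i_0}, f_{i_0+1})$. Since $f_t \to \infty$ as $n \to \infty$, we certainly have $i_0 < t$ for all $n$ large enough. Furthermore,

$$\begin{aligned}
\mathbb{P}\left(d_H(M^{n,1}_\lambda, M^n) \ge \frac{1}{\lambda^{1-\epsilon}} \,\Big|\, \Lambda^n \le \lambda_0\right)
&\le \mathbb{P}\left(d_H(M^{n,1}_\lambda, M^{n,1}_{f_t}) \ge \frac{1}{2\lambda^{1-\epsilon}} \,\Big|\, \Lambda^n \le \lambda_0\right) + \mathbb{P}\left(d_H(M^{n,1}_{f_t}, M^n) \ge \frac{1}{2\lambda^{1-\epsilon}} \,\Big|\, \Lambda^n \le \lambda_0\right) \\
&\le \mathbb{P}\left(d_H(M^{n,1}_\lambda, M^{n,1}_{f_t}) \ge \frac{1}{2\lambda^{1-\epsilon}} \,\Big|\, \Lambda^n \le \lambda_0\right) + \frac{1}{n\,\mathbb{P}(\Lambda^n \le \lambda_0)}
\end{aligned}$$

for all $\lambda$ large enough and all $n$ such that $2\lambda \le n^{1/6}(\log n)^{-7/2}$, by (14). It then follows from (15) and the tightness of $(\Lambda^n, n \ge 1)$ that there exists a constant $C \in (0, \infty)$ such that for all $\lambda$ large enough,

$$\mathbb{P}\left(d_H(M^{n,1}_\lambda, M^n) \ge \frac{1}{\lambda^{1-\epsilon}} \,\Big|\, \Lambda^n \le \lambda_0\right) \le \frac{1}{\mathbb{P}(\Lambda^n \le \lambda_0)}\left(\mathbb{P}(i_0(\lambda) \le I^n \le t) + \frac{1}{n}\right) \le C\left(e^{-f_{i_0(\lambda)}^{\epsilon/2}} + \frac{1}{n}\right).$$

Letting $n$ tend to infinity proves the lemma. □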



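We are now in a position to prove a partial version of our main result. In what follows, we write $\hat{M}^n$, $\hat{M}^{n,1}_\lambda$ and $\hat{\mathscr{M}}^1_\lambda$ for the metric spaces obtained from $M^n$, $M^{n,1}_\lambda$ and $\mathscr{M}^1_\lambda$ by ignoring their measures.

Lemma 4.7. There exists a random compact metric space $\hat{\mathscr{M}}$ such that, as $n \to \infty$,

$$\hat{M}^n \xrightarrow{d} \hat{\mathscr{M}} \quad \text{in } (\hat{\mathcal{M}}, d_{GH}).$$

Moreover, as $\lambda \to \infty$,

$$\hat{\mathscr{M}}^1_\lambda \xrightarrow{d} \hat{\mathscr{M}} \quad \text{in } (\hat{\mathcal{M}}, d_{GH}).$$

Proof. Recall that the metric space $(\hat{\mathcal{M}}, d_{GH})$ is complete and separable. Theorem 4.4 entails that

$$\hat{M}^{n,1}_\lambda \xrightarrow{d} \hat{\mathscr{M}}^1_\lambda$$

as $n \to \infty$ in $(\hat{\mathcal{M}}, d_{GH})$. The stated results then follow from this, Lemma 4.5 and the principle of accompanying laws (see Theorem 3.1.14 of [70] or Theorem 9.1.13 in the second edition). □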

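Let $\bar{M}^{n,1}_\lambda$ be the measured metric space obtained from $M^{n,1}_\lambda$ by rescaling so that the total mass is one (in $M^{n,1}_\lambda$ we gave each vertex mass $n^{-2/3}$; now we give each vertex mass $|V(\mathbb{M}^{n,1}_\lambda)|^{-1}$).

Proposition 4.8. For any $\epsilon > 0$,

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\left(d_{GHP}(\bar{M}^{n,1}_\lambda, M^n) \ge \epsilon\right) = 0.$$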



In order to prove this proposition, we need some notation and a lemma. Let $\mathbb{F}^n_\lambda$ be the subgraph of $\mathbb{M}^n$ with edge set $E(\mathbb{M}^n) \setminus E(\mathbb{M}^{n,1}_\lambda)$. Then $\mathbb{F}^n_\lambda$ is a forest which we view as rooted by taking the root of a component to be the unique vertex in that component which was an element of $\mathbb{M}^{n,1}_\lambda$. For $v \in V(\mathbb{M}^{n,1}_\lambda)$, let $S^n_\lambda(v)$ be the number of nodes in the component $\mathbb{F}^n_\lambda(v)$ of $\mathbb{F}^n_\lambda$ rooted at $v$. The fact that the random variables $(S^n_\lambda(v), v \in V(\mathbb{M}^{n,1}_\lambda))$ are exchangeable will play a key role in what follows.

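Lemma 4.9. For any $\delta > 0$,

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\left(\max_{v \in V(\mathbb{M}^{n,1}_\lambda)} S^n_\lambda(v) > \delta n\right) = 0. \qquad (16)$$

Proof. Let $U^n_\lambda$ be the event that vertices 1 and 2 lie in the same component of $\mathbb{F}^n_\lambda$. Note that, conditional on $\max_{v \in V(\mathbb{M}^{n,1}_\lambda)} S^n_\lambda(v) > \delta n$, the event $U^n_\lambda$ occurs with probability at least $\delta^2/2$ for sufficiently large $n$. So, in order to prove the lemma it suffices to show that

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}(U^n_\lambda) = 0. \qquad (17)$$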

In order to prove (17), we consider the following modification of Prim's algorithm. We build the MST conditional on the collection $\mathbb{M}^n_\lambda$ of trees. We start from the component containing vertex 1 in $\mathbb{M}^n_\lambda$. To this component, we add the lowest weight edge connecting it to a new vertex. This vertex lies in a new component of $\mathbb{M}^n_\lambda$, which we add in its entirety, before again seeking the lowest-weight edge leaving the tree we have so far constructed. We continue in this way until we have constructed the whole MST. (Observe that the components we add along the way may, of course, be singletons.) Note that if we think of Prim's algorithm as a discrete-time process, with time given by the number of vertices added so far, then this is simply a time-changed version which looks only at times when we add edges of weight strictly greater than $1/n + \lambda/n^{4/3}$. This is because when Prim's algorithm first touches a component of $\mathbb{M}^n_\lambda$, it necessarily adds all of its edges before adding any edges of weight exceeding $1/n + \lambda/n^{4/3}$. For $i \ge 0$, write $C_i$ for the tree constructed by the modified algorithm up to step $i$ and let $e_i$ be the edge added at step $i$. The advantage of the modified approach is that, for each $i \ge 1$, we can calculate the probability that the endpoint of $e_i$ which does not lie in $C_{i-1}$ touches $\mathbb{M}^{n,1}_\lambda$, given that it has not at steps $0, 1, \dots, i - 1$. Recall that, at each stage of Prim's algorithm, we add the edge of minimal weight leaving the current tree. We are thinking of this tree as a collection of components of $\mathbb{M}^n_\lambda$ connected by edges of weight strictly greater than $1/n + \lambda/n^{4/3}$. In general, different sections of the tree built so far are subject to different conditionings depending on the weights of the connecting edges and the order in which they were added. In particular, the endpoint of $e_i$ contained in $C_{i-1}$ is more likely to be in a section with a lower weight-conditioning. However, the other endpoint of $e_i$ is equally likely to be any of the vertices of $\{1, 2, \dots, n\} \setminus C_{i-1}$ because all that we know about them is that they lie in (given) components of $\mathbb{M}^n_\lambda$.
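The modified algorithm described above can be sketched in a few lines. This is an illustrative, unoptimised implementation for small $n$, not the authors' code: vertices are labelled $0, \dots, n-1$, `components_below` and `modified_prim` are hypothetical names, and the forest components are recovered as the connected components of the graph of edges with weight at most $p$ (which has the same components as the minimum spanning forest at level $p$).

```python
import random


def components_below(n, weights, p):
    """Component label of each vertex in the graph of edges with weight <= p."""
    label = list(range(n))

    def find(x):
        while label[x] != x:
            label[x] = label[label[x]]
            x = label[x]
        return x

    for (u, v), w in weights.items():
        if w <= p:
            label[find(u)] = find(v)
    return [find(v) for v in range(n)]


def modified_prim(n, weights, p, start=0):
    """Grow a tree from `start`, absorbing whole forest components.

    At each step the lightest edge leaving the current vertex set is added,
    together with the entire component containing its new endpoint. Returns
    the connecting edges e_1, e_2, ... with their weights, in order.
    """
    comp = components_below(n, weights, p)
    members = {}
    for v, c in enumerate(comp):
        members.setdefault(c, []).append(v)
    current = set(members[comp[start]])
    connecting = []
    while len(current) < n:
        # Lightest edge with exactly one endpoint in the current set.
        (u, v), w = min(
            ((e, w) for e, w in weights.items()
             if (e[0] in current) != (e[1] in current)),
            key=lambda ew: ew[1],
        )
        new_vertex = v if u in current else u
        connecting.append(((u, v), w))
        current.update(members[comp[new_vertex]])
    return connecting


rng = random.Random(1)
n, lam = 60, 1.0
weights = {(u, v): rng.random() for u in range(n) for v in range(u + 1, n)}
p = 1 / n + lam / n ** (4 / 3)
connecting = modified_prim(n, weights, p)
```

Every connecting edge found this way has weight strictly greater than $p$, since an edge of weight at most $p$ leaving the current set would join two vertices of the same forest component. This is the time-change property noted in the text.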



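Formally, let $k = n - 1 - |E(\mathbb{M}^n_\lambda)|$. Let $C_0$ be the component containing 1 in $\mathbb{M}^n_\lambda$. Recursively, for $1 \le i \le k$, let

• $e_i$ be the smallest-weight edge leaving $C_{i-1}$ and

• $C_i$ be the component containing 1 in the graph with edge-set $E(\mathbb{M}^n_\lambda) \cup \{e_1, \dots, e_i\}$.

The graph with edge-set $E(\mathbb{M}^n_\lambda) \cup \{e_1, \dots, e_k\}$ is precisely $\mathbb{M}^n$. Let $I_1$ be the first index for which $V(\mathbb{M}^{n,1}_\lambda) \subset V(C_{I_1})$, so that $I_1$ is the time at which the component containing 1 attaches to $\mathbb{M}^{n,1}_\lambda$. For each $1 \le i \le k$, the endpoint of $e_i$ not in $C_{i-1}$ is uniformly distributed among all vertices of $\{1, \dots, n\} \setminus C_{i-1}$. So, conditionally given $\mathbb{M}^n_\lambda$, $e_1, \dots, e_{i-1}$ and on $\{I_1 \ge i\}$, the probability that $I_1$ takes the value $i$ is $|V(\mathbb{M}^{n,1}_\lambda)|/(n - |V(C_{i-1})|)$. Therefore,

$$\mathbb{P}(I_1 > i \mid \mathbb{M}^n_\lambda) \le \left(1 - \frac{|V(\mathbb{M}^{n,1}_\lambda)|}{n}\right)^i. \qquad (18)$$

By Theorem 2 of [53] (see also Lemma 3 of [47]), for all $\delta > 0$,

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\left(\left|\frac{|V(\mathbb{M}^{n,1}_\lambda)|}{2\lambda n^{2/3}} - 1\right| > \delta\right) = 0. \qquad (19)$$

Using (18) and (19), it follows that for any $\delta > 0$, there exists $B > 0$ such that

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\left(I_1 > B n^{1/3}/\lambda\right) < \delta. \qquad (20)$$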



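Next, let $Z$ be a uniformly random element of $\{1, \dots, n\} \setminus V(\mathbb{M}^{n,1}_\lambda)$, and let $L_\lambda$ be the size of the component of $\mathbb{M}^n_\lambda$ that contains $Z$. Theorem A1 of [37] shows that

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{E}\left[\frac{\sum_{i \ge 2} |V(\mathbb{M}^{n,i}_\lambda)|^2}{n^{4/3}/\lambda}\right] < \infty,$$

which implies that

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{E}\left[\frac{L_\lambda}{n^{1/3}/\lambda}\right] < \infty.$$

For each $i \ge 1$, given that $i < I_1$, the difference $|V(C_i)| - |V(C_{i-1})|$ is stochastically dominated by $L_\lambda$, so that

$$\mathbb{E}\left[|V(C_i)| \mathbf{1}_{\{i < I_1\}}\right] \le i\, \mathbb{E}[L_\lambda].$$

By (20) and Markov's inequality, there exists $B' > 0$ such that

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\left(|V(C_{I_1 - 1})| > B' n^{2/3}/\lambda^2\right) < \delta. \qquad (21)$$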
The graph $C^*$ with edge-set $E(C_{I_1 - 1}) \cup \{e_{I_1}\}$ forms part of the component containing 1 in $\mathbb{F}^n_\lambda$; indeed, the endpoint of $e_{I_1}$ not contained in $C_{I_1 - 1}$ is the root of this component. Write $v_1$ for this root vertex. Now consider freezing the construction of the MST via the modified version of Prim's algorithm at time $I_1$ and constructing the rest of the MST using the modified version of Prim's algorithm starting now from vertex 2. Let $\ell = n - 1 - |E(\mathbb{M}^n_\lambda)| - I_1$.

Let $D_0$ be the component containing 2 in the graph with edge-set $E(\mathbb{M}^n_\lambda) \cup \{e_1, \dots, e_{I_1}\}$. Recursively, for $1 \le j \le \ell$, let

• $f_j$ be the smallest-weight edge leaving $D_{j-1}$ and

• $D_j$ be the component containing 2 in the graph with edge-set $E(\mathbb{M}^n_\lambda) \cup \{e_1, \dots, e_{I_1}\} \cup \{f_1, \dots, f_j\}$.

Let $I_2$ be the first index for which $V(\mathbb{M}^{n,1}_\lambda) \subset V(D_{I_2})$, and let $J_1$ be the first index for which $f_{J_1}$ has an endpoint in $V(C^*)$.

Recall that $U^n_\lambda$ is the event that 1 and 2 lie in the same component of $\mathbb{F}^n_\lambda$. If $U^n_\lambda$ occurs then we necessarily have $J_1 < I_2$. To prove (17) it therefore suffices to show that

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}(J_1 < I_2) = 0. \qquad (22)$$

In order to do so, we first describe how the construction of $C_{I_1}$ conditions the locations of attachment of the edges $f_j$. As in the introduction, for $e \in E(K_n)$, $W_e$ is the weight of edge $e$, and unconditionally these weights are i.i.d. Uniform[0, 1] random variables.

Write $A_0 = V(C_0)$, and for $1 \le i \le I_1$, let $A_i = V(C_i) \setminus V(C_{i-1})$. In particular, $A_{I_1} = V(\mathbb{M}^{n,1}_\lambda)$. After $C_{I_1}$ is constructed, for each $0 \le i < I_1$, the conditioning on edges incident to $A_i$ is as follows.

(a) Every edge between $V(C_{i-1})$ and $A_i$ has weight at least $W_{e_i}$.

(b) For each $i < j \le I_1$, every edge between $A_i$ and $[n] \setminus V(C_j)$ has weight at least $\max\{W_{e_k}, i < k \le j\}$.

In particular, (b) implies all edges from $A_i$ to $[n] \setminus V(C_{I_1})$ are conditioned to have weight at least $\max\{W_{e_k}, i < k \le I_1\}$. This entails that components which are added later have lower weight-conditioning. In particular, there is no conditioning on edges from $A_{I_1} = V(\mathbb{M}^{n,1}_\lambda)$ to $[n] \setminus V(C_{I_1})$ (except the initial conditioning, that all such edges have weight at least $1/n + \lambda/n^{4/3}$, which comes from conditioning on $\mathbb{M}^n_\lambda$).

It follows that under the conditioning imposed by the construction of $C_{I_1}$, it is not the case that for $1 \le j < \ell$, the endpoint of $f_{j+1}$ outside $D_j$ is uniformly distributed among $\{1, \dots, n\} \setminus D_j$. However, the conditioning precisely biases these endpoints away from the sets $A_i$ with $i < I_1$ (but not away from $A_{I_1} = V(\mathbb{M}^{n,1}_\lambda)$). As a consequence, for each $1 \le j \le \ell$, conditional on the edge set $E(\mathbb{M}^n_\lambda) \cup \{e_1, \dots, e_{I_1}\} \cup \{f_1, \dots, f_j\}$ and on the event $\{J_1 \ge j\} \cap \{I_2 \ge j\}$, the probability that $j = I_2$ is at least $(|V(\mathbb{M}^{n,1}_\lambda)| - 1)/(n - |V(D_{j-1})|)$ and the probability that $j = J_1$ is at most $|V(C^*)|/(n - |V(D_{j-1})|)$. Hence,

$$\mathbb{P}(I_2 > j \mid \mathbb{M}^n_\lambda) \le \left(1 - \frac{|V(\mathbb{M}^{n,1}_\lambda)| - 1}{n}\right)^j,$$

and so, by (19), we obtain that for any $\delta > 0$ there exists $B'' > 0$ such that

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\left(I_2 > B'' n^{1/3}/\lambda\right) < \delta. \qquad (23)$$
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Moreover, 



F(J 1 >i\ Ml) > 1 



n-|HE>Ji-i) 

Note that $|V(C^*)| = |V(C_{I_1 - 1})| + 1$. Also, just as for the components $C_i$, given that $i < I_2$, the difference $|V(D_i)| - |V(D_{i-1})|$ is stochastically dominated by $L_\lambda$ and so we obtain the analogue of (21): there exists $B'''$ such that

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\left(|V(D_{I_2 - 1})| > B''' n^{2/3}/\lambda^2\right) < \delta.$$

Hence, from this and (21), we see that there exists $B''''$ such that

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\left(J_1 < \lambda^2 n^{1/3}/B''''\right) < \delta. \qquad (24)$$

Together, (23) and (24) establish (22) and complete the proof. □

Armed with this lemma, we now turn to the proof of Proposition 4.8. 

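Proof of Proposition 4.8. Fix $\epsilon > 0$ and let $N^n_\epsilon$ be the minimal number of open balls of radius $\epsilon/4$ needed to cover the (finite) space $M^n$. This automatically yields a covering of $M^{n,1}_\lambda$ by open balls of radius $\epsilon/4$ since $M^{n,1}_\lambda$ is included in $M^n$. From this covering, we can easily construct a new covering $B^{n,1}_\lambda, \dots, B^{n,N^n_\epsilon}_\lambda$ of $M^{n,1}_\lambda$ by sets of diameter at most $\epsilon/2$ which are pairwise disjoint. Let

$$\tilde{B}^{n,i}_\lambda = \bigcup_{v \in B^{n,i}_\lambda} \mathbb{F}^n_\lambda(v), \qquad 1 \le i \le N^n_\epsilon,$$

and let $C^n_\lambda = \bigcup_{i=1}^{N^n_\epsilon} (B^{n,i}_\lambda \times \tilde{B}^{n,i}_\lambda)$, which defines a correspondence between $M^{n,1}_\lambda$ and $M^n$. Moreover, its distortion is clearly at most $2 d_H(M^{n,1}_\lambda, M^n) + \epsilon$. Therefore, by Lemma 4.5,

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\left(\mathrm{dis}(C^n_\lambda) \ge 2\epsilon\right) = 0. \qquad (25)$$

Next, write $V^n_\lambda = |V(\mathbb{M}^{n,1}_\lambda)|$ and take an arbitrary relabelling of the elements of $V(\mathbb{M}^{n,1}_\lambda)$ by $\{1, 2, \dots, V^n_\lambda\}$. Since $(S^n_\lambda(1), \dots, S^n_\lambda(V^n_\lambda))$ are exchangeable, Theorem 16.23 of Kallenberg [40] entails that for any $\delta > 0$,

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\left(\max_{1 \le i \le V^n_\lambda} \left|\sum_{j=1}^{i} \left(\frac{S^n_\lambda(j)}{n} - \frac{1}{V^n_\lambda}\right)\right| > \delta\right) = 0 \qquad (26)$$

as soon as we have that for all $\delta > 0$,

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\left(\max_{1 \le i \le V^n_\lambda} S^n_\lambda(i) > \delta n\right) = 0,$$

which is precisely the content of Lemma 4.9.

Now define a measure $\pi^n_\lambda$ on $M^{n,1}_\lambda \times M^n$ by

$$\pi^n_\lambda(\{(u, v)\}) = \frac{1}{V^n_\lambda |\tilde{B}^{n,i}_\lambda|} \wedge \frac{1}{n |B^{n,i}_\lambda|}, \qquad (u, v) \in B^{n,i}_\lambda \times \tilde{B}^{n,i}_\lambda,\ 1 \le i \le N^n_\epsilon.$$

Note that $\pi^n_\lambda((C^n_\lambda)^c) = 0$ by definition. Moreover, the marginals of $\pi^n_\lambda$ are given by

$$\pi^n_{(1)}(\{u\}) = \frac{1}{V^n_\lambda} \wedge \frac{|\tilde{B}^{n,i}_\lambda|}{n |B^{n,i}_\lambda|}, \qquad u \in B^{n,i}_\lambda,\ 1 \le i \le N^n_\epsilon,$$

and

$$\pi^n_{(2)}(\{v\}) = \frac{|B^{n,i}_\lambda|}{V^n_\lambda |\tilde{B}^{n,i}_\lambda|} \wedge \frac{1}{n}, \qquad v \in \tilde{B}^{n,i}_\lambda,\ 1 \le i \le N^n_\epsilon.$$

Therefore, the discrepancy $D(\pi^n_\lambda)$ of $\pi^n_\lambda$ with respect to the uniform measures on $M^n$ and $M^{n,1}_\lambda$ is at most

$$N^n_\epsilon \max_{1 \le i \le N^n_\epsilon} \left|\frac{|B^{n,i}_\lambda|}{V^n_\lambda} - \frac{|\tilde{B}^{n,i}_\lambda|}{n}\right|,$$

which (by relabelling the elements of $M^{n,1}_\lambda$ so that the vertices in each $B^{n,i}_\lambda$ have consecutive labels and using exchangeability) is bounded above by

$$2 N^n_\epsilon \max_{1 \le i \le V^n_\lambda} \left|\sum_{j=1}^{i} \left(\frac{S^n_\lambda(j)}{n} - \frac{1}{V^n_\lambda}\right)\right|.$$

Then, for any $K > 0$,

$$\begin{aligned}
\mathbb{P}\left(d_{GHP}(\bar{M}^{n,1}_\lambda, M^n) \ge \epsilon\right)
&\le \mathbb{P}\left(\mathrm{dis}(C^n_\lambda) \ge 2\epsilon\right) + \mathbb{P}\left(D(\pi^n_\lambda) \ge \epsilon\right) \\
&\le \mathbb{P}\left(\mathrm{dis}(C^n_\lambda) \ge 2\epsilon\right) + \mathbb{P}\left(N^n_\epsilon \max_{1 \le i \le V^n_\lambda} \left|\sum_{j=1}^{i} \left(\frac{S^n_\lambda(j)}{n} - \frac{1}{V^n_\lambda}\right)\right| \ge \frac{\epsilon}{2}\right) \\
&\le \mathbb{P}\left(\mathrm{dis}(C^n_\lambda) \ge 2\epsilon\right) + \mathbb{P}\left(\max_{1 \le i \le V^n_\lambda} \left|\sum_{j=1}^{i} \left(\frac{S^n_\lambda(j)}{n} - \frac{1}{V^n_\lambda}\right)\right| \ge \frac{\epsilon}{2K}\right) + \mathbb{P}\left(N^n_\epsilon \ge K\right).
\end{aligned}$$

But now recall that $N^n_\epsilon$ is the minimal number of open balls of radius $\epsilon/4$ needed to cover $M^n$. Let $N_\epsilon$ be the same quantity for $\hat{\mathscr{M}}$. Then by Lemma 4.7, $\hat{M}^n \xrightarrow{d} \hat{\mathscr{M}}$, which easily implies that $\limsup_{n \to \infty} \mathbb{P}(N^n_\epsilon \ge K) \le \mathbb{P}(N_\epsilon \ge K)$. In particular, by (25) and (26),

$$\lim_{\lambda \to \infty} \limsup_{n \to \infty} \mathbb{P}\left(d_{GHP}(\bar{M}^{n,1}_\lambda, M^n) \ge \epsilon\right) \le \mathbb{P}(N_\epsilon \ge K),$$

and the right-hand side converges to 0 as $K \to \infty$. □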

Let $\bar{\mathscr{M}}^1_\lambda$ be the measured metric space obtained from $\mathscr{M}^1_\lambda$ by renormalising the measure to be a probability.

Theorem 4.10. There exists a random compact measured metric space $\mathscr{M}$ of total mass 1 such that as $n \to \infty$,

$$M^n \xrightarrow{d} \mathscr{M}$$

in the space $(\mathcal{M}, d_{GHP})$. Moreover, as $\lambda \to \infty$,

$$\bar{\mathscr{M}}^1_\lambda \xrightarrow{d} \mathscr{M}$$

in the space $(\mathcal{M}, d_{GHP})$. Finally, writing $\mathscr{M} = (X, d, \mu)$, we have $(X, d) \overset{d}{=} \hat{\mathscr{M}}$ in $(\hat{\mathcal{M}}, d_{GH})$, where $\hat{\mathscr{M}}$ is as in Lemma 4.7.

Proof. Recall that the metric space $(\mathcal{M}, d_{GHP})$ is complete and separable. Theorem 4.4 entails that

$$\bar{M}^{n,1}_\lambda \xrightarrow{d} \bar{\mathscr{M}}^1_\lambda$$

as $n \to \infty$ in $(\mathcal{M}, d_{GHP})$. The stated results then follow from this, Proposition 4.8 and the principle of accompanying laws (see Theorem 3.1.14 of [70] or Theorem 9.1.13 in the second edition). □

Finally, we observe that, analogous to the fact that $M^{n,1}_\lambda$ is a subspace of $M^n$, we can view $\hat{\mathscr{M}}^1_\lambda$ as a subspace of $\hat{\mathscr{M}}$. (We emphasize that this does not follow from Theorem 4.10.) To this end, we briefly introduce the marked Gromov-Hausdorff topology of [52, Section 6.4]. Let $\mathcal{M}^*$ be the set of ordered pairs of the form $(X, Y)$, where $X = (X, d)$ is a compact metric space and $Y \subset X$ is a compact subset of $X$ (such pairs are considered up to isometries of $X$). A sequence $(X_n, Y_n)$ of such pairs converges to a limit $(X, Y)$ if there exist correspondences $C_n \in C(X_n, X)$ whose restrictions to $Y_n \times Y$ are correspondences between $Y_n$ and $Y$, and such that $\mathrm{dis}(C_n) \to 0$. (In particular, this implies that $Y_n$ converges to $Y$ for the Gromov-Hausdorff distance, when these spaces are equipped with the restriction of the distances on $X_n, X$.) Moreover, a set $A \subset \mathcal{M}^*$ is relatively compact if and only if $\{X : (X, Y) \in A\}$ is relatively compact for the Gromov-Hausdorff topology.

Recall the definition of the tight sequence of random variables $(\Lambda^n, n \ge 1)$ at the beginning of this section. By taking subsequences, we may assume that we have the joint convergence in distribution

(((M", M^' 1 ), A G Z), A") A (((uT, Jt\\ A G Z), A) , 
for the product topology on .Mj x IR. 5 This coupling of course has the properties that 



^# = j£ and that j£\ = ^\ for every A G Z. Combining this with Lemma 4.5 we easily 



obtain the following. 

Proposition 4.11. There exists a probability space on which one may define a triple 

(4,Ae Z),A) 

with the following properties : (i) A is an a. s. finite random variable; (ii) j# — , = 
and {ytft G M.* for every A G Z; and (Hi) for every e G (0, 1) and Ao > large enough, 

P (d H (~#, Ji{) > X'- 1 A < A ) = oe(A) . 
In particular, {y£ , — >■ M) as A — > oo for the marked Gromov-Hausdorff topology. 



5 Properties of the scaling limit 

In this section we give some properties of the limiting metric space $\mathscr{M}$. We start with some general properties that $\mathscr{M}$ shares with the Brownian CRT of Aldous [12, 13, 14]:

Theorem 5.1. $\mathscr{M}$ is a measured $\mathbb{R}$-tree which is almost surely binary and whose mass measure is concentrated on its leaves.

5 This is a slight abuse of notation, in the sense that the limiting spaces $\hat{\mathscr{M}}$ on the right-hand side should, in principle, depend on $\lambda$, but obviously these spaces are almost surely all isometric.
Proof. By the second distributional convergence in Theorem 4.10, we may (and will) work in a space in which we almost surely have $\lim_{\lambda \to \infty} d_{GHP}(\bar{\mathscr{M}}^1_\lambda, \mathscr{M}) = 0$. Since it is the Gromov-Hausdorff limit of the sequence of $\mathbb{R}$-trees $\hat{\mathscr{M}}^1_\lambda$, $\hat{\mathscr{M}}$ is itself an $\mathbb{R}$-tree (see for instance [27]). For fixed $\lambda \in \mathbb{R}$, each component of $\mathscr{M}_\lambda$ is obtained from $\mathscr{G}_\lambda$, the scaling limit of $G^n_\lambda$, using the cutting process. From the construction of $\mathscr{G}_\lambda$ detailed in Section 4.1, it is clear that $\mathscr{G}_\lambda$ almost surely does not contain points of degree more than three, and so $\mathscr{M}^1_\lambda$ is almost surely binary.

Next, let us work with the coupling $(\hat{\mathscr{M}}, (\hat{\mathscr{M}}^1_\lambda, \lambda \in \mathbb{Z}))$ of Proposition 4.11. We can assume, using the last statement of this proposition and the Skorokhod representation theorem, that $(\hat{\mathscr{M}}, \hat{\mathscr{M}}^1_\lambda) \to (\hat{\mathscr{M}}, \hat{\mathscr{M}})$ a.s. in $\mathcal{M}^*$. Now suppose that $\hat{\mathscr{M}}$ has a point $x_0$ of degree at least 4 with positive probability. On this event, we can find four points $x_1, x_2, x_3, x_4$ of the skeleton of $\hat{\mathscr{M}}$, each having degree 2, and such that the geodesic paths from $x_i$ to $x_0$ have strictly positive lengths and meet only at $x_0$. But for $\lambda$ large enough, $x_0, x_1, \dots, x_4$ all belong to $\hat{\mathscr{M}}^1_\lambda$, as well as the geodesic paths from $x_1, \dots, x_4$ to $x_0$. This contradicts the fact that $\hat{\mathscr{M}}^1_\lambda$ is binary. Hence, $\hat{\mathscr{M}}$ is binary almost surely.

Let $x$ and $x_\lambda$ be sampled according to the probability measures on $\mathscr{M}$ and on $\bar{\mathscr{M}}^1_\lambda$ respectively. For the remainder of the proof we abuse notation by writing $(\mathscr{M}, x)$ and $(\bar{\mathscr{M}}^1_\lambda, x_\lambda)$ for the marked spaces (random elements of $\mathcal{M}^{1,1}$) obtained by marking at the points $x$ and $x_\lambda$. Then we may, in fact, work in a space in which almost surely

$$\lim_{\lambda \to \infty} d^{1,1}_{GHP}\left((\mathscr{M}, x), (\bar{\mathscr{M}}^1_\lambda, x_\lambda)\right) = 0.$$

As noted earlier, the mass measure on $\mathscr{M}^1_\lambda$ is almost surely concentrated on the leaves of $\mathscr{M}^1_\lambda$, and it follows that for each fixed $\lambda$, $x_\lambda$ is almost surely a leaf. Let

$$\Delta(x) = \sup\left\{\min(f^{-1}(x), t - f^{-1}(x)) \text{ for } f : [0, t] \to \mathscr{M} \text{ a geodesic with } x \in \mathrm{Im}(f)\right\},$$

so, in particular, $\Delta(x) = 0$ precisely if $x$ is a leaf. For each fixed $\lambda$, since $x_\lambda$ is almost surely a leaf, it is straightforward to verify that almost surely

$$d^{1,1}_{GHP}\left((\mathscr{M}, x), (\bar{\mathscr{M}}^1_\lambda, x_\lambda)\right) \ge \Delta(x)/2.$$

But then taking $\lambda \to \infty$ along any countable sequence shows that $\Delta(x) = 0$ almost surely. □

To distinguish $\mathscr{M}$ from Aldous' CRT, we look at a natural notion of fractal dimension, the Minkowski (or box-counting) dimension [29]. Given a compact metric space $X$ and $r > 0$, let $N(X, r)$ be the minimal number of open balls of radius $r$ needed to cover $X$.

We define the lower and upper Minkowski dimensions by

$$\underline{\dim}_M(X) = \liminf_{r \downarrow 0} \frac{\log N(X, r)}{\log(1/r)} \quad \text{and} \quad \overline{\dim}_M(X) = \limsup_{r \downarrow 0} \frac{\log N(X, r)}{\log(1/r)}.$$

If $\underline{\dim}_M(X) = \overline{\dim}_M(X)$, then this value is called the Minkowski dimension and is denoted $\dim_M(X)$.
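To make the definition concrete, here is a small numerical sketch that estimates a box-counting exponent for a finite point cloud. It does not compute $N(X, r)$ exactly. Instead it uses a greedy $r$-net, whose size $M(r)$ satisfies $N(X, r) \le M(r) \le N(X, r/2)$ and therefore has the same logarithmic growth rate. The function names and the grid example are illustrative assumptions, not part of the paper.

```python
import math


def greedy_net_size(points, dist, r):
    """Size of a greedy r-net.

    Centres are pairwise at distance >= r, and every point lies at distance
    < r from some centre, so open balls of radius r around the centres cover.
    """
    centres = []
    for x in points:
        if all(dist(x, c) >= r for c in centres):
            centres.append(x)
    return len(centres)


def box_counting_slope(points, dist, r_large, r_small):
    """Slope of log(net size) against log(1/r) between two radii."""
    m_large = greedy_net_size(points, dist, r_large)
    m_small = greedy_net_size(points, dist, r_small)
    return math.log(m_small / m_large) / math.log(r_large / r_small)


# Toy example (hypothetical): a 40 x 40 grid in the unit square, which
# should behave like a two-dimensional set at these scales.
side = 40
grid = [(i / (side - 1), j / (side - 1)) for i in range(side) for j in range(side)]
euclid = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
slope = box_counting_slope(grid, euclid, 0.2, 0.05)
```

For a finite sample the slope is only meaningful at radii well above the spacing of the points; the limit $r \downarrow 0$ in the definition has no direct finite analogue.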

Proposition 5.2. The Minkowski dimension of $\mathscr{M}$ exists and is equal to 3 almost surely.

Since the Brownian CRT $\mathcal{T}$ satisfies $\dim_M(\mathcal{T}) = 2$ almost surely ([25, Corollary 5.3]), we obtain the following result, which gives a negative answer to a conjecture of Aldous [8].

Corollary 5.3. For any random variable $A > 0$, the laws of $\mathscr{M}$ and of $A\mathcal{T}$, the metric space $\mathcal{T}$ with distances rescaled by $A$, are mutually singular.

The proof relies on an explicit description of the components of $\mathscr{G}_\lambda$, given in [3]. We only give a partial statement, since that is all that we need here. Note that, given $s(\mathscr{G}^1_\lambda) = k \ge 2$, the kernel $\ker(\mathscr{G}^1_\lambda)$ is a 3-regular multigraph with $3k - 3$ edges and hence $2(k - 1)$ vertices. Fix $\lambda \in \mathbb{R}$ and $k \ge 2$, and $K$ a 3-regular multigraph with $3k - 3$ edges. Label the edges of $K$ by $\{1, 2, \dots, 3k - 3\}$ arbitrarily.

Construction 1. Independently sample random variables $\Gamma_k \sim \mathrm{Gamma}((3k - 2)/2, 1/2)$ and $(Y_1, Y_2, \dots, Y_{3k-3}) \sim \mathrm{Dirichlet}(1, 1, \dots, 1)$. Attach a line-segment of length $Y_j \sqrt{\sigma \Gamma_k}$ in the place of edge $j$ in $K$, for $1 \le j \le 3k - 3$.

Construction 2. Sample $(X_1, X_2, \dots, X_{3k-3}) \sim \mathrm{Dirichlet}(1/2, 1/2, \dots, 1/2)$ and, given $(X_1, \dots, X_{3k-3})$, let $(\mathcal{T}^{(1)}, \dots, \mathcal{T}^{(3k-3)})$ be independent CRT's with masses given by $(\sigma X_1, \dots, \sigma X_{3k-3})$ respectively. For $1 \le i \le 3k - 3$, let $(x_i, x_i')$ be two independent points in $\mathcal{T}^{(i)}$, chosen according to the normalized mass measure. Take the metric gluing of $(\mathcal{T}^{(i)}, 1 \le i \le 3k - 3)$ induced by the graph structure of $K$, by viewing $x_i, x_i'$ as the extremities of the edge $i$.
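The edge lengths of Construction 1 are easy to sample, and the following sketch does so with the Python standard library. Two assumptions should be flagged: the second Gamma parameter $1/2$ is read here as a rate (so the scale passed to `gammavariate` is 2), and the Dirichlet$(1, \dots, 1)$ vector is generated by normalising i.i.d. exponential variables. The function name is hypothetical, and Construction 2 is not simulated.

```python
import math
import random


def sample_core_edge_lengths(k, sigma, rng):
    """Sample the 3k - 3 edge lengths Y_j * sqrt(sigma * Gamma_k).

    Assumes Gamma((3k - 2)/2, 1/2) is parametrised by shape and rate, so the
    scale argument of gammavariate is 2.
    """
    m = 3 * k - 3
    gamma_k = rng.gammavariate((3 * k - 2) / 2, 2.0)
    # Dirichlet(1, ..., 1): normalised i.i.d. standard exponentials.
    e = [rng.expovariate(1.0) for _ in range(m)]
    total = sum(e)
    ys = [x / total for x in e]
    scale = math.sqrt(sigma * gamma_k)
    return [y * scale for y in ys], gamma_k


rng = random.Random(2)
k, sigma = 4, 2.0
lengths, gamma_k = sample_core_edge_lengths(k, sigma, rng)
```

The total length of the sampled segments is $\sqrt{\sigma \Gamma_k}$, since the Dirichlet weights sum to one.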

Here we should recall some of the basic properties of the CRT $\mathcal{T}$, referring the reader to, e.g., [43] for more details. If $e = (e(s), 0 \le s \le 1)$ is a standard normalized Brownian excursion then $\mathcal{T}$ is the quotient space of $[0, 1]$ endowed with the pseudo-distance $d_e(s, t) = 2(e(s) + e(t) - 2\inf_{s \wedge t \le u \le s \vee t} e(u))$, by the relation $\{d_e = 0\}$. It is seen as a measured metric space by endowing it with the mass measure which is the image of Lebesgue measure on $[0, 1]$ by the canonical projection $p : [0, 1] \to \mathcal{T}$. It is also naturally rooted at the point $p(0)$. Likewise, the CRT with mass $\sigma$, denoted by $\mathcal{T}_\sigma$, is coded in a similar fashion by (twice) a Brownian excursion conditioned to have duration $\sigma$. By scaling properties of Brownian excursion, this is the same as multiplying distances by $\sqrt{\sigma}$ in $\mathcal{T}$, and multiplying the mass measure by $\sigma$.
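The encoding of a tree by an excursion can be illustrated on a discrete path. The sketch below evaluates the pseudo-distance formula from the text on a list of heights; it is a toy analogue only, with a hand-written Dyck path standing in for the Brownian excursion, and the function name is an assumption.

```python
def excursion_distance(e, i, j):
    """Discrete analogue of d_e(s, t) = 2(e(s) + e(t) - 2 min_{[s, t]} e)."""
    lo, hi = min(i, j), max(i, j)
    return 2 * (e[i] + e[j] - 2 * min(e[lo:hi + 1]))


# A small Dyck path (hypothetical example): it codes a tree with a root,
# one internal vertex at height 1, and two leaves at height 2.
path = [0, 1, 2, 1, 2, 1, 0]

d_leaves = excursion_distance(path, 2, 4)   # the two visits at height 2
d_same = excursion_distance(path, 1, 3)     # two visits of the same vertex
```

Times at pseudo-distance zero, such as indices 1 and 3 here, are identified in the quotient, which is how the path folds up into a tree.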

Proposition 5.4. The metric space obtained by Construction 1 (resp. Construction 2) has the same distribution as $\mathrm{core}(\mathscr{G}^1_\lambda)$ (resp. $\mathscr{G}^1_\lambda$), given $\mathrm{mass}(\mathscr{G}^1_\lambda) = \sigma$, $s(\mathscr{G}^1_\lambda) = k$ and $\ker(\mathscr{G}^1_\lambda) = K$.

The proof of Proposition 5.2 builds on this result and requires a couple of lemmas. Recall the notation from Section 3.2.

Lemma 5.5. Let $X = (X, d, x)$ be a safely pointed $\mathbb{R}$-graph and fix $r > 0$. Then $N(X, r) \le N(X_x, r) \le N(X, r) + 2$.

This lemma will be proved in Section 7.1, where we give a more precise description of $X_x$. The next lemma is a concentration result for the mass and surplus of $\mathscr{G}^1_\lambda$. This should be seen as a continuum analogue of similar results in [47, 53]. We stress that these bounds are far from being sharp, and could be much improved by a more careful analysis. In the rest of this section, if $(Y(\lambda), \lambda \ge 0)$ is a family of positive random variables and $(f(\lambda), \lambda \ge 0)$ is a positive function, we write $Y(\lambda) \asymp f(\lambda)$ if for all $a > 1$,

$$\mathbb{P}\left(Y(\lambda) \notin [f(\lambda)/a, a f(\lambda)]\right) = \mathrm{oe}(\lambda).$$

Note that this only constrains the above probability for large $\lambda$.
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Lemma 5.6. It is the case that 

9\3 

mass^ 1 ) x 2A and s(^) x — . 

Proof. We use the construction of £f\ described in Section 4.1. Recall that (W(t),t > 0) is 
a standard Brownian motion, that W\(t) = W(t) + Xt - t 2 /2, and that B x (t) = W\(t) - 
mino< s <t W\(s). Note that, letting 

A\ = {\W(t)\ < (2A) V t for all t > 0} , (27) 

we have F(A X ) = oe(A). Considering first t < 2A, by symmetry, the reflection principle and 
scaling we have that 

P ( sup \W{t)\ > A J < 2P ( sup > A j 

\0<t<2A / \0<i<2A / 

= 2P(|W(2A)| > A) 
= 2f(\W(2)\ > y/\) , 

and this is oe(A) since W(2) is Gaussian. Turning to t > 2A, note that letting W = 
(W(u + 2A) — W(2X),u > 0), then W is a standard Brownian motion by the Markov 
property. Hence, on the event {sup 0<t<2A < A}, the probability that |W(£)| > t 

for some t > 2A is at most P (3u > 6~\W'(u)\ > u + A) < 2P (m&x u > (W'(u) - u) > A). 
We deduce that P (A c x ) = oe(A) from the fact that ma.x u > (W'(u) — u) has an exponential 
distribution, see e.g. [64]. 
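On $A_\lambda$,

\[ -\frac{t^2}{2} + \lambda t - ((2\lambda) \vee t) \le W^\lambda(t) \le -\frac{t^2}{2} + \lambda t + (2\lambda) \vee t, \qquad t \ge 0, \]

from which it is elementary to obtain that if $\lambda \ge 4$, the following properties hold.

(i) The excursion $\varepsilon$ of $B^\lambda$ that straddles the time $\lambda$ has length in $[2\lambda - 8, 2\lambda + 8]$.

(ii) All other excursions of $B^\lambda$ have length at most 6.

(iii) The area of $\varepsilon$ is in $[2\lambda^3/3 - 4\lambda^2, \, 2\lambda^3/3 + 8\lambda^2]$.

Note that (i) and (ii) imply that, for $\lambda > 8$, on $A_\lambda$, the excursion $\varepsilon$ of $B^\lambda$ is the longest, which we previously called $\varepsilon^1$, and which encodes the component $\mathscr{G}_\lambda^1$ of $\mathscr{G}_\lambda$. This implies that mass($\mathscr{G}_\lambda^1$) $\asymp 2\lambda$, since mass($\mathscr{G}_\lambda^1$) is precisely the length of $\varepsilon^1$. Finally, recall that, given $\varepsilon^1$, $s(\mathscr{G}_\lambda^1)$ has a Poisson distribution with parameter equal to the area of $\varepsilon^1$. Therefore, standard large deviation bounds together with (iii) imply that $s(\mathscr{G}_\lambda^1) \asymp 2\lambda^3/3$. □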
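
As a sanity check on the constants $2\lambda$ and $2\lambda^3/3$ in Lemma 5.6, one can look at the deterministic drift $\lambda t - t^2/2$ alone: it is positive exactly on $(0, 2\lambda)$ and integrates to $2\lambda^3/3$ there. The following is only an illustrative sketch; the function names are hypothetical and the Brownian fluctuations, which the proof controls on the event $A_\lambda$, are ignored entirely.

```python
def drift(lam, t):
    """Deterministic part lam*t - t^2/2 of the drifted Brownian motion."""
    return lam * t - t * t / 2.0


def drift_excursion(lam, steps=100_000):
    """Return (length, area) of the positive excursion of the drift parabola.

    The parabola vanishes at t = 0 and t = 2*lam. The area is a midpoint
    Riemann sum, so it is an approximation of 2*lam^3/3.
    """
    length = 2.0 * lam
    h = length / steps
    area = sum(drift(lam, (i + 0.5) * h) for i in range(steps)) * h
    return length, area


if __name__ == "__main__":
    for lam in (4.0, 8.0, 16.0):
        length, area = drift_excursion(lam)
        # Compare with the leading-order values 2*lam and 2*lam^3/3.
        print(lam, length, area, 2 * lam**3 / 3)
```

The corrections of order $\lambda^2$ in property (iii) are of lower order than this leading term, which is why they do not affect the statement $s(\mathscr{G}_\lambda^1) \asymp 2\lambda^3/3$.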

Proof that $\dim_M(\mathscr{M}) \ge 3$ almost surely. In this proof, we always work with the coupling from Proposition 4.11, but for convenience omit the decorations from the notation, writing e.g. $\mathscr{M}$ for the version of the limit space constructed in that coupling. In particular, this allows us to view $\mathscr{M}_\lambda^1$ as a subspace of $\mathscr{M}$ for every $\lambda \in \mathbb{Z}$.

Since $\mathscr{M}_\lambda^1$ is obtained from $\mathscr{G}_\lambda^1$ by performing the cutting operation of Section 3.2, Lemma 5.5 implies that for every $r > 0$,

\[ \mathbb{P}\big( N(\mathscr{M}, 1/\lambda) < r \big) \le \mathbb{P}\big( N(\mathscr{M}_\lambda^1, 1/\lambda) < r \big) \le \mathbb{P}\big( N(\mathscr{G}_\lambda^1, 1/\lambda) < r \big). \tag{28} \]

Next, by viewing core($\mathscr{G}_\lambda^1$) as a graph with edge-lengths, we obtain that $N(\mathscr{G}_\lambda^1, 1/\lambda)$ is at least equal to the number $N'(1/\lambda)$ of edges of core($\mathscr{G}_\lambda^1$) that have length at least $2/\lambda$, since the open balls with radius $1/\lambda$ centred at the midpoints of these edges are pairwise disjoint.
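
The packing argument above (disjoint balls force a lower bound on the covering number) can be illustrated on a finite metric space. The sketch below is an assumption-laden toy, not part of the proof: it greedily extracts a maximal $r$-separated set, whose size bounds $N(\cdot, r)$ from above and $N(\cdot, r/2)$ from below. The helper names and the grid example are hypothetical.

```python
import math


def greedy_net(points, r, dist):
    """Greedily extract a maximal r-separated subset of a finite metric space.

    Every input point lies within distance < r of the returned net, so the
    open r-balls around the net cover the space, while the open r/2-balls
    around the net points are pairwise disjoint.
    """
    net = []
    for p in points:
        if all(dist(p, q) >= r for q in net):
            net.append(p)
    return net


def dimension_estimate(points, r, dist):
    """log(net size) / log(1/r): a crude proxy for the Minkowski dimension."""
    return math.log(len(greedy_net(points, r, dist))) / math.log(1.0 / r)


if __name__ == "__main__":
    # Hypothetical test space: a 60 x 60 grid in the unit square (dimension 2).
    grid = [(i / 59, j / 59) for i in range(60) for j in range(60)]
    for r in (0.2, 0.1, 0.05):
        print(r, dimension_estimate(grid, r, math.dist))
```

On a finite sample the estimate is only meaningful for radii well above the sample spacing; the proof in the text works instead with the exact scaling of edge lengths in the core.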

Now fix $\sigma > 0$, $k \ge 2$ and a 3-regular multigraph $K$ with $3k - 3$ edges, and recall the notation of Construction 1. Given that mass($\mathscr{G}_\lambda^1$) $= \sigma$, $s(\mathscr{G}_\lambda^1) = k$ and ker($\mathscr{G}_\lambda^1$) $= K$, the edge-lengths of core($\mathscr{G}_\lambda^1$) are given by $Y_i \sqrt{\sigma \Gamma_k}$, $1 \le i \le 3k - 3$, and we conclude that (still conditionally)

\[ N'(1/\lambda) = \big| \{ i \in \{1, \ldots, 3k - 3\} : Y_i \sqrt{\sigma \Gamma_k} > 2/\lambda \} \big|. \]

Note that this does not depend on $K$ but only on $\sigma$ and on $k$. Now $\Gamma_k \sim \mathrm{Gamma}((3k - 2)/2, 1/2)$ can be represented as the sum of $3k - 2$ independent random variables with distribution $\mathrm{Gamma}(1/2, 1/2)$, which have mean 1, and by standard large deviation results this implies that

\[ \sup_{k \in [\lambda^3/2, \lambda^3]} \mathbb{P}(\Gamma_k < \lambda^3) = \mathrm{oe}(\lambda). \]

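Hence, by first conditioning on mass($\mathscr{G}_\lambda^1$), $s(\mathscr{G}_\lambda^1)$ and using Lemma 5.6, for any given $c > 0$,

\[
\begin{aligned}
\mathbb{P}\big( N'(1/\lambda) < c\lambda^3 \big) &\le \sup_{\sigma \ge \lambda,\ k \in [\lambda^3/2, \lambda^3]} \mathbb{P}\big( N'(1/\lambda) < c\lambda^3 \,\big|\, \mathrm{mass}(\mathscr{G}_\lambda^1) = \sigma, \ s(\mathscr{G}_\lambda^1) = k \big) + \mathrm{oe}(\lambda) \\
&\le \sup_{\sigma \ge \lambda,\ k \in [\lambda^3/2, \lambda^3]} \mathbb{P}\big( |\{ i \in \{1, \ldots, 3k-3\} : Y_i \sqrt{\sigma \Gamma_k} > 2/\lambda \}| < c\lambda^3 \big) + \mathrm{oe}(\lambda) \\
&\le \sup_{k \in [\lambda^3/2, \lambda^3]} \mathbb{P}\big( |\{ i \in \{1, \ldots, 3k-3\} : Y_i > 2/\lambda^3 \}| < c\lambda^3 \big) + \mathrm{oe}(\lambda).
\end{aligned}
\]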

We now use that $(Y_1, \ldots, Y_{3k-3}) \sim \mathrm{Dirichlet}(1, \ldots, 1)$ is distributed as $(\gamma_1, \ldots, \gamma_{3k-3}) / (\gamma_1 + \cdots + \gamma_{3k-3})$, where $\gamma_1, \ldots, \gamma_{3k-3}$ are independent Exponential(1) random variables. Standard large deviations results for gamma random variables imply that

\[ \sup_{k \in [\lambda^3/2, \lambda^3]} \mathbb{P}\big( \gamma_1 + \ldots + \gamma_{3k-3} > 4\lambda^3 \big) = \mathrm{oe}(\lambda). \]

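From this we obtain

\[
\begin{aligned}
\sup_{k \in [\lambda^3/2, \lambda^3]} & \mathbb{P}\big( |\{ i \in \{1, \ldots, 3k-3\} : Y_i > 2/\lambda^3 \}| < c\lambda^3 \big) \\
&\le \sup_{k \in [\lambda^3/2, \lambda^3]} \mathbb{P}\big( |\{ i \in \{1, \ldots, 3k-3\} : \gamma_i > 8 \}| < c\lambda^3 \big) + \mathrm{oe}(\lambda),
\end{aligned}
\]

and this is $\mathrm{oe}(\lambda)$ for $c < e^{-8}$, since $|\{ i \in \{1, \ldots, 3k-3\} : \gamma_i > 8 \}|$ is $\mathrm{Bin}(3k - 3, e^{-8})$ distributed.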
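
The representation of a Dirichlet$(1, \ldots, 1)$ vector as normalised Exponential(1) variables, and the resulting binomial count of indices with $\gamma_i > 8$, can be sketched numerically as follows. This is a hedged illustration using only the Python standard library; the function names, the seed and the sample size are arbitrary choices, not quantities from the text.

```python
import math
import random


def dirichlet_uniform(m, rng):
    """Sample (Y_1, ..., Y_m) ~ Dirichlet(1, ..., 1) by normalising
    m independent Exponential(1) variables; the raw variables are returned too."""
    gammas = [rng.expovariate(1.0) for _ in range(m)]
    total = sum(gammas)
    return [g / total for g in gammas], gammas


def count_large(gammas, threshold=8.0):
    """Number of indices i with gamma_i > threshold.

    For i.i.d. Exponential(1) inputs this count is Bin(m, exp(-threshold)).
    """
    return sum(1 for g in gammas if g > threshold)


if __name__ == "__main__":
    rng = random.Random(0)
    m = 300_000
    ys, gammas = dirichlet_uniform(m, rng)
    # Observed count versus the binomial mean m * e^{-8}.
    print(count_large(gammas), m * math.exp(-8.0))
```

The event $\{\gamma_1 + \cdots + \gamma_{3k-3} \le 4\lambda^3\}$ is what allows the threshold $Y_i > 2/\lambda^3$ to be replaced by $\gamma_i > 8$ in the display above.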

It follows that for such $c$, $\mathbb{P}(N'(1/\lambda) < c\lambda^3) = \mathrm{oe}(\lambda)$, which with (28) implies that

\[ \mathbb{P}\big( N(\mathscr{M}, 1/\lambda) < c\lambda^3/2 \big) = \mathrm{oe}(\lambda). \]

We obtain by the Borel-Cantelli Lemma that $N(\mathscr{M}, 1/\lambda) \ge c\lambda^3/2$ for all $\lambda \in \mathbb{Z}$ sufficiently large. By sandwiching $1/r$ between consecutive integers, this yields that almost surely

\[ \underline{\dim}_M(\mathscr{M}) = \liminf_{r \to 0} \frac{\log N(\mathscr{M}, r)}{\log(1/r)} \ge 3. \]

□

We now prove the upper bound from Proposition 5.2.

Proof that $\dim_M(\mathscr{M}) \le 3$ almost surely. Recall the definition of $\Lambda$ from Proposition 4.11.
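Fix $\lambda_0 > 0$ and an integer $\lambda > \lambda_0$. We work conditionally on the event $\{\Lambda \le \lambda_0\}$. Next, fix $\epsilon > 0$. If $B_1, \ldots, B_N$ is a covering of $\mathscr{M}_\lambda^1$ by balls of radius $1/\lambda^{1-\epsilon}$ then, since $\mathscr{M}_\lambda^1 \subset \mathscr{M}$, the centres $x_1, \ldots, x_N$ of these balls are elements of $\mathscr{M}$. On the event $\{d_H(\mathscr{M}_\lambda^1, \mathscr{M}) < 1/\lambda^{1-\epsilon}\}$, whose complement has conditional probability $\mathrm{oe}(\lambda)$ by Proposition 4.11, the balls with centres $x_1, \ldots, x_N$ and radius $2/\lambda^{1-\epsilon}$ then form a covering of $\mathscr{M}$. Hence

\[
\begin{aligned}
\mathbb{P}\big( N(\mathscr{M}, 2/\lambda^{1-\epsilon}) > 5\lambda^3 \,\big|\, \Lambda \le \lambda_0 \big) &\le \mathbb{P}\big( N(\mathscr{M}_\lambda^1, 1/\lambda^{1-\epsilon}) > 5\lambda^3 \,\big|\, \Lambda \le \lambda_0 \big) + \mathrm{oe}(\lambda) \\
&\le \frac{\mathbb{P}\big( N(\mathscr{G}_\lambda^1, 1/\lambda^{1-\epsilon}) + 2 s(\mathscr{G}_\lambda^1) > 5\lambda^3 \big)}{\mathbb{P}(\Lambda \le \lambda_0)} + \mathrm{oe}(\lambda) \\
&\le \frac{\mathbb{P}\big( N(\mathscr{G}_\lambda^1, 1/\lambda^{1-\epsilon}) > 3\lambda^3 \big)}{\mathbb{P}(\Lambda \le \lambda_0)} + \mathrm{oe}(\lambda), \qquad (29)
\end{aligned}
\]

where in the penultimate step we used Lemma 5.5 and the fact that $\mathscr{M}_\lambda^1$ is obtained from $\mathscr{G}_\lambda^1$ by performing $s(\mathscr{G}_\lambda^1)$ cuts, and in the last step we used the fact that $s(\mathscr{G}_\lambda^1) \asymp 2\lambda^3/3$ from Lemma 5.6.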

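To estimate $N(\mathscr{G}_\lambda^1, 1/\lambda^{1-\epsilon})$, we now use Construction 2 to obtain a copy of $\mathscr{G}_\lambda^1$ conditioned to satisfy mass($\mathscr{G}_\lambda^1$) $= \sigma$, $s(\mathscr{G}_\lambda^1) = k$ and ker($\mathscr{G}_\lambda^1$) $= K$, where $K$ is a 3-regular multigraph with $3k - 3$ edges. Recall that we glue $3k - 3$ Brownian CRT's $(\mathcal{T}_{\sigma X_1}, \ldots, \mathcal{T}_{\sigma X_{3k-3}})$ along the edges of $K$. These CRT's are conditionally independent given their masses $\sigma X_1, \ldots, \sigma X_{3k-3}$, and $(X_1, \ldots, X_{3k-3})$ has Dirichlet$(1/2, \ldots, 1/2)$ distribution. (Here we include the mass in the notation because it will vary later on.) If each of these trees has diameter less than $1/\lambda^{1-\epsilon}$, then clearly we can cover the glued space by $3k - 3$ balls of radius $1/\lambda^{1-\epsilon}$, each centred in a distinct tree $\mathcal{T}_{\sigma X_i}$, $1 \le i \le 3k - 3$. Therefore, by first conditioning on mass($\mathscr{G}_\lambda^1$) and on $s(\mathscr{G}_\lambda^1)$, and then using Lemma 5.6,

\[
\begin{aligned}
\mathbb{P}\big( N(\mathscr{G}_\lambda^1, 1/\lambda^{1-\epsilon}) > 3\lambda^3 \big) &\le \sup_{\sigma \le 3\lambda,\ k \in [\lambda^3/2, \lambda^3]} \mathbb{P}\big( N(\mathscr{G}_\lambda^1, 1/\lambda^{1-\epsilon}) > 3\lambda^3 \,\big|\, \mathrm{mass}(\mathscr{G}_\lambda^1) = \sigma, \ s(\mathscr{G}_\lambda^1) = k \big) + \mathrm{oe}(\lambda) \\
&\le \sup_{\sigma \le 3\lambda,\ k \in [\lambda^3/2, \lambda^3]} \mathbb{P}\Big( \max_{1 \le i \le 3k-3} \mathrm{diam}(\mathcal{T}_{\sigma X_i}) \ge 1/\lambda^{1-\epsilon} \Big) + \mathrm{oe}(\lambda). \qquad (30)
\end{aligned}
\]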

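We can represent $(X_1, \ldots, X_{3k-3})$ as $(\gamma_1, \ldots, \gamma_{3k-3}) / (\gamma_1 + \cdots + \gamma_{3k-3})$, where $\gamma_1, \ldots, \gamma_{3k-3}$ are i.i.d. random variables with distribution $\mathrm{Gamma}(1/2, 1)$. Hence

\[
\begin{aligned}
\mathbb{P}\Big( \max_{1 \le i \le 3k-3} X_i > 1/\lambda^{3-\epsilon} \Big) &\le \mathbb{P}\big( \gamma_1 + \cdots + \gamma_{3k-3} < \lambda^{3-\epsilon/2} \big) + \mathbb{P}\Big( \max_{1 \le i \le 3k-3} \gamma_i > \lambda^{\epsilon/2} \Big) \\
&\le \mathbb{P}\big( \gamma_1 + \cdots + \gamma_{3k-3} < \lambda^{3-\epsilon/2} \big) + 1 - \big( 1 - \mathbb{P}(\gamma_1 > \lambda^{\epsilon/2}) \big)^{3k-3}.
\end{aligned}
\]

Standard large deviations results for gamma random variables then entail that for all $\epsilon > 0$,

\[ \sup_{k \in [\lambda^3/2, \lambda^3]} \mathbb{P}\Big( \max_{1 \le i \le 3k-3} X_i > 1/\lambda^{3-\epsilon} \Big) = \mathrm{oe}(\lambda), \]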






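which in turn implies that

\[
\begin{aligned}
\sup_{\sigma \le 3\lambda,\ k \in [\lambda^3/2, \lambda^3]} & \mathbb{P}\Big( \max_{1 \le i \le 3k-3} \mathrm{diam}(\mathcal{T}_{\sigma X_i}) \ge 1/\lambda^{1-\epsilon} \Big) \\
&\le \sup_{k \in [\lambda^3/2, \lambda^3]} \Big\{ \mathbb{P}\Big( \max_{1 \le i \le 3k-3} X_i > 1/\lambda^{3-\epsilon} \Big) + \mathbb{P}\Big( \max_{1 \le i \le 3k-3} \mathrm{diam}\big(\mathcal{T}^{(i)}_{3/\lambda^{2-\epsilon}}\big) \ge 1/\lambda^{1-\epsilon} \Big) \Big\} \\
&\le \sup_{k \in [\lambda^3/2, \lambda^3]} \mathbb{P}\Big( \max_{1 \le i \le 3k-3} X_i > 1/\lambda^{3-\epsilon} \Big) + \mathbb{P}\Big( \max_{1 \le i \le 3\lambda^3} \mathrm{diam}\big(\mathcal{T}^{(i)}_{3/\lambda^{2-\epsilon}}\big) \ge 1/\lambda^{1-\epsilon} \Big),
\end{aligned}
\]

where we used that, by scaling, $\mathrm{diam}(\mathcal{T}_\sigma)$ is stochastically increasing in $\sigma$, and $(\mathcal{T}^{(i)}_{3/\lambda^{2-\epsilon}}, 1 \le i \le \lfloor 3\lambda^3 \rfloor)$ are independent CRT's, each with mass $3/\lambda^{2-\epsilon}$ (an upper bound for $\sigma X_i$ when $\sigma \le 3\lambda$ and $X_i \le 1/\lambda^{3-\epsilon}$). Using this bound and Brownian scaling, it follows that

\[ \sup_{\sigma \le 3\lambda,\ k \in [\lambda^3/2, \lambda^3]} \mathbb{P}\Big( \max_{1 \le i \le 3k-3} \mathrm{diam}(\mathcal{T}_{\sigma X_i}) \ge 1/\lambda^{1-\epsilon} \Big) \le \mathrm{oe}(\lambda) + 1 - \Big( 1 - \mathbb{P}\big( \mathrm{diam}(\mathcal{T}) \ge \lambda^{\epsilon/2}/\sqrt{3} \big) \Big)^{3\lambda^3}. \tag{31} \]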

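Next, it is well-known that the height of $\mathcal{T}$, that is, the maximal distance from the root to another point, is theta-distributed:

\[ \mathbb{P}(\mathrm{height}(\mathcal{T}) \ge x) = \sum_{k \ge 1} (-1)^{k+1} e^{-k^2 x^2} \le e^{-x^2}. \]

Since $\mathrm{diam}(\mathcal{T}) \le 2\, \mathrm{height}(\mathcal{T})$, it follows that

\[ \mathbb{P}(\mathrm{diam}(\mathcal{T}) \ge x) = \mathrm{oe}(x). \]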

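We obtain that (31) is $\mathrm{oe}(\lambda)$, and (30) then yields that $\mathbb{P}(N(\mathscr{G}_\lambda^1, 1/\lambda^{1-\epsilon}) > 3\lambda^3) = \mathrm{oe}(\lambda)$. By (29), we then have

\[ \mathbb{P}\big( N(\mathscr{M}, 2/\lambda^{1-\epsilon}) > 5\lambda^3 \,\big|\, \Lambda \le \lambda_0 \big) = \mathrm{oe}(\lambda). \]

Therefore, the Borel-Cantelli Lemma implies that $N(\mathscr{M}, 2/\lambda^{1-\epsilon}) \le 5\lambda^3$ a.s. for every integer $\lambda > \lambda_0$ large enough. This implies that, conditionally on $\{\Lambda \le \lambda_0\}$, $\overline{\dim}_M(\mathscr{M}) \le 3 + \epsilon$ almost surely for every $\epsilon > 0$, by sandwiching $1/r$ between integers in $\limsup_{r \to 0} \log N(\mathscr{M}, r) / \log(1/r)$. Since $\Lambda$ is almost surely finite and $\lambda_0$ was arbitrary, this then holds unconditionally for any $\epsilon > 0$. □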

This concludes the proof of Proposition 5.2. 

6 The structure of $\mathbb{R}$-graphs

In this section, we investigate $\mathbb{R}$-graphs and prove the structure theorems claimed in Section 2.2.

6.1 Girth in $\mathbb{R}$-graphs

In this section, $\mathrm{X} = (X, d)$ is an $\mathbb{R}$-graph. The girth of $\mathrm{X}$ is defined by

\[ \mathrm{gir}(\mathrm{X}) = \inf\{ \mathrm{len}(c) : c \text{ is an embedded cycle in } \mathrm{X} \}. \]






If $(X, d)$ is an $\mathbb{R}$-graph, then by definition $(B_{\epsilon(x)}(x), d)$ is an $\mathbb{R}$-tree for every $x \in X$ and for some function $\epsilon : X \to (0, \infty)$. The balls $(B_{\epsilon(x)}(x), x \in X)$ form an open cover of $X$. By extracting a finite sub-cover, we see that there exists $\epsilon > 0$ such that for every $x \in X$, the space $(B_\epsilon(x), d)$ is an $\mathbb{R}$-tree. We let $R(\mathrm{X})$ be the supremum of all numbers $\epsilon > 0$ with this property. It is immediate that $\mathrm{gir}(\mathrm{X}) \ge 2R(\mathrm{X}) > 0$. In fact, it is not difficult to show that $\mathrm{gir}(\mathrm{X}) = 4R(\mathrm{X})$ and that $(B_{R(\mathrm{X})}(x), d)$ is an $\mathbb{R}$-tree. More precisely, the closed ball $(\overline{B}_{R(\mathrm{X})}(x), d)$ is also a (compact) $\mathbb{R}$-tree, since it is the closure of the corresponding open ball. These facts are not absolutely crucial in the arguments to come, but they make some proofs more elegant, so we will take them for granted and leave their proofs to the reader, who is also referred to Proposition 2.2.15 of [55].
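
For a finite graph with edge-lengths (a special case of an $\mathbb{R}$-graph), the girth can be computed directly, and the identity $\mathrm{gir}(\mathrm{X}) = 4R(\mathrm{X})$ quoted above then gives $R(\mathrm{X})$. The following is a minimal sketch under that assumption; the function names and the encoding of a multigraph as `(u, v, length)` triples are hypothetical choices.

```python
import heapq


def shortest_path(n, edges, src, dst, skip):
    """Dijkstra on an undirected multigraph, ignoring the edge with index `skip`."""
    adj = [[] for _ in range(n)]
    for idx, (u, v, w) in enumerate(edges):
        if idx != skip:
            adj[u].append((v, w))
            adj[v].append((u, w))
    dist = [float("inf")] * n
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist[dst]


def girth(n, edges):
    """Length of the shortest embedded cycle; inf if the graph is acyclic.

    The shortest cycle through an edge (u, v, w) has length
    w + (shortest u-v path avoiding that edge); a loop (u == v) gives w.
    """
    return min(
        (w + shortest_path(n, edges, u, v, idx)
         for idx, (u, v, w) in enumerate(edges)),
        default=float("inf"),
    )


if __name__ == "__main__":
    theta = [(0, 1, 1.0), (0, 1, 2.0), (0, 1, 3.0)]  # two vertices, three parallel edges
    g = girth(2, theta)
    print(g, g / 4)  # girth and the corresponding R(X) = gir(X)/4
```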

Proposition 6.1. If $f \in C([a, b], X)$ is a local geodesic in $\mathrm{X}$, then for every $t \in [a, b]$, the restriction of $f$ to $[t - R(\mathrm{X}), t + R(\mathrm{X})] \cap [a, b]$ is a geodesic. In particular, if $c$ is an embedded cycle and $x \in c$, then $c$ contains a geodesic arc of length $2R(\mathrm{X})$ with mid-point $x$.

Proof. The function $f$ is injective on any interval of length at most $2R(\mathrm{X})$, since otherwise we could exhibit an embedded cycle with length at most $2R(\mathrm{X}) = \mathrm{gir}(\mathrm{X})/2$. In particular, $f$ is injective on the interval $[t - R(\mathrm{X}), t + R(\mathrm{X})] \cap [a, b]$, and takes values in the $\mathbb{R}$-tree $(\overline{B}_{R(\mathrm{X})}(f(t)), d)$, so that its image is a geodesic segment, and since $f$ is parameterized by arc-length, its restriction to the above interval is an isometry. This proves the first statement.

For the second statement, note that every injective path $f \in C([a, b], X)$ parameterized by arc-length is a local geodesic since, for every $t$, the path $f$ restricted to $[t - R(\mathrm{X}), t + R(\mathrm{X})] \cap [a, b]$ is an injective path in the $\mathbb{R}$-tree $(\overline{B}_{R(\mathrm{X})}(f(t)), d)$ parameterized by arc-length, and hence is a geodesic. If now $g : \mathbb{S}_1 \to X$ is an injective continuous function inducing the embedded cycle $c$, it suffices to apply the previous claim to a parametrisation by arc-length mapping 0 to $x$ of the function $t \mapsto g(e^{2i\pi t})$. □

6.2 Structure of the core 

In this section, $\mathrm{X} = (X, d)$ is again an $\mathbb{R}$-graph. Recall that core($\mathrm{X}$) is the union of all arcs with endpoints in embedded cycles.

Proposition 6.2. The set core($\mathrm{X}$) is a finite union of embedded cycles and simple arcs that are disjoint from the embedded cycles except at their endpoints. Moreover, the space (core($\mathrm{X}$), $d$) is an $\mathbb{R}$-graph with no leaves.

Proof. Assume, for a contradiction, that the union of all embedded cycles cannot be written as a finite union of embedded cycles. Then we can find an infinite sequence $c_1, c_2, \ldots$ of embedded cycles such that $c_i \setminus (c_1 \cup \cdots \cup c_{i-1})$ is non-empty for every $i \ge 1$, and thus contains at least one point $x_i$. Up to taking subsequences, one can assume that $x_i$ converges to some point $x$, and that $d(x, x_i) < R(\mathrm{X})/2$ for every $i \ge 1$. Let $\gamma'_i$ be a geodesic from $x$ to $x_i$: this geodesic takes its values in the $\mathbb{R}$-tree $B_{R(\mathrm{X})}(x)$. Since $x_i \in c_i$, by Proposition 6.1 we can find two geodesic paths starting from $x_i$, meeting only at $x_i$, with length $R(\mathrm{X}) - d(x, x_i)$, and taking values in $c_i \cap B_{R(\mathrm{X})}(x)$. At least one of these paths, $\gamma''_i$, does not pass through $x$, and so the concatenation $\gamma_i$ of $\gamma'_i$ and $\gamma''_i$ is an injective path parameterized by arc-length starting from $x$ and with length $R(\mathrm{X})$. So it is, in fact, a geodesic path, since it takes its values in $B_{R(\mathrm{X})}(x)$. We let $y_i$ be the endpoint of $\gamma_i$, so that $d(x, y_i) = R(\mathrm{X})$ for every $i \ge 1$. Now, we observe that if $i < j$, the paths $\gamma_i$ and $\gamma_j$ both start from the same point $x$, but since $\gamma''_i$ takes values in $c_i$, since $\gamma''_j$ passes through $x_j \notin c_i$, and since $d(x, x_i) \vee d(x, x_j) < R(\mathrm{X})/2$, these paths are disjoint outside the ball $B_{R(\mathrm{X})/2}(x)$. This implies that $d(y_i, y_j) \ge R(\mathrm{X})$ for every $i < j$, and contradicts the compactness of $X$.

Therefore, the union $X_0$ of all embedded cycles is closed and has finitely many connected components. By definition, core($\mathrm{X}$) is the union of $X_0$ together with all simple arcs with endpoints in $X_0$. Obviously, in this definition, we can restrict our attention to simple arcs having only their endpoints in $X_0$. So let $x, y \in X_0$ with $x \ne y$ be linked by a simple arc $A$ taking its values outside $X_0$, except at its endpoints. Necessarily, $x$ and $y$ must be in disjoint connected components of $X_0$, because otherwise there would exist a path from $x$ to $y$ in $X_0$ whose concatenation with $A$ would create an embedded cycle not included in $X_0$. Furthermore, there can exist at most one arc $A$, or else it would be easy to exhibit an embedded cycle not included in $X_0$. So we see that core($\mathrm{X}$) is a finite union of simple arcs and embedded cycles, which is obviously connected, and is thus closed. Any point in core($\mathrm{X}$) has degree at least 2 by definition.

It remains to check that the intrinsic metric on core($\mathrm{X}$) is given by $d$ itself. Let $x, y \in$ core($\mathrm{X}$) and $\gamma$ be a geodesic path from $x$ to $y$. Assume that $\gamma$ takes some value $z = \gamma(t)$ outside core($\mathrm{X}$). Let $t_1 = \sup\{s \le t : \gamma(s) \in \mathrm{core}(\mathrm{X})\}$ and $t_2 = \inf\{s \ge t : \gamma(s) \in \mathrm{core}(\mathrm{X})\}$, so that $\gamma((t_1, t_2)) \cap \mathrm{core}(\mathrm{X}) = \emptyset$. Since core($\mathrm{X}$) is connected, we can join $\gamma(t_1)$ and $\gamma(t_2)$ by a simple arc included in core($\mathrm{X}$), and the union of this arc with $\gamma((t_1, t_2))$ is an embedded cycle not contained in core($\mathrm{X}$), a contradiction. □

6.3 The kernel of $\mathbb{R}$-graphs with no leaves

In this section, $\mathrm{X}$ is an $\mathbb{R}$-graph with no leaves. We now start to prove Theorem 2.7 on the structure of such $\mathbb{R}$-graphs. The set $k(\mathrm{X}) = \{x \in X : \deg_X(x) \ge 3\}$ of branchpoints of $\mathrm{X}$ forms the vertex set of ker($\mathrm{X}$).

Proposition 6.3. The set $k(\mathrm{X})$ is finite, and $\deg_X(x) < \infty$ for every $x \in k(\mathrm{X})$.

Proof. By Proposition 6.2, the number of cycles of $\mathrm{X}$ is finite. We assume that $\mathrm{X}$ is not acyclic, and argue by induction on the maximal number of independent embedded cycles, that is, of embedded cycles $c_1, \ldots, c_k$ such that $c_i \setminus (c_1 \cup \ldots \cup c_{i-1}) \ne \emptyset$ for $1 \le i \le k$. Plainly, we may and will assume that for all $i \in \{1, 2, \ldots, k\}$, either $c_i$ is disjoint from $c_1 \cup \ldots \cup c_{i-1}$, or $c_i \setminus (c_1 \cup \ldots \cup c_{i-1})$ is a simple arc of the form $\gamma((0, 1))$, where $\gamma : [0, 1] \to X$ satisfies $\gamma(0), \gamma(1) \in c_1 \cup \ldots \cup c_{i-1}$. The result is trivial if $\mathrm{X}$ is unicyclic ($k = 1$). Suppose $\mathrm{X}$ has $k$ independent embedded cycles $c_1, \ldots, c_k$ as above. Consider the smallest connected subset $X'$ of $X$ containing $c_1, \ldots, c_{k-1}$: this subset is the union of $c_1, \ldots, c_{k-1}$ with some simple arcs having only their endpoints as elements of $c_1 \cup \ldots \cup c_{k-1}$, and is a closed subset of $X$.

If $c_k$ does not intersect $X'$, then there exists a unique simple arc with one endpoint $a$ in $c_k$ and the other endpoint $b$ in $X'$, and disjoint from $c_k \cup X'$ elsewhere. Then $a, b$ must be elements of $k(\mathrm{X})$: $a$ is the only element of $k(\mathrm{X})$ in $c_k$, we have $\deg_X(a) = 3$, and $\deg_X(b) = \deg_{X'}(b) + 1$. Therefore, the number of points in $k(\mathrm{X})$ is at most $2 + |k(\mathrm{X}')|$, where $\mathrm{X}'$ is the set $X'$ endowed with the intrinsic metric inherited from $\mathrm{X}$. This is an $\mathbb{R}$-graph without leaves and with (at most) $k - 1$ independent cycles.

If on the other hand $c_k \cap X' \ne \emptyset$, then by assumption we have $X = X' \cup A$, where $A$ is a sub-arc of $c_k$ disjoint from $X'$ except at its endpoints $a, b$. The latter are elements of $k(\mathrm{X})$, and satisfy $\deg_X(a) \le \deg_{X'}(a) + 2$ and similarly for $b$ (note that $a, b$ may be equal). After we remove $A \setminus \{a, b\}$ from $X$, we are left with an $\mathbb{R}$-graph $\mathrm{X}'$ (in the induced metric) without leaves, and with at most $k - 1$ independent cycles.

The result follows by induction on $k$. □



If $\mathrm{X}$ is unicyclic, then $X$ is in fact identical to its unique embedded cycle $c$. In this case, $k(\mathrm{X}) = \emptyset$, and we let $e(\mathrm{X}) = \{c\}$. If $\mathrm{X}$ has at least two distinct embedded cycles, then the previous proof entails that $k(\mathrm{X}) \ne \emptyset$, and more precisely that every embedded cycle contains at least one point of $k(\mathrm{X})$. The set $X \setminus k(\mathrm{X})$ has finitely many connected components (in fact, there are precisely $\frac{1}{2} \sum_{x \in k(\mathrm{X})} \deg_X(x)$ components, as the reader is invited to verify), which are simple arcs of the form $\gamma((0, 1))$, where $\gamma : [0, 1] \to X$ is such that $\gamma$ is injective on $[0, 1)$, such that $\gamma(0), \gamma(1) \in k(\mathrm{X})$, and such that $\gamma((0, 1)) \cap k(\mathrm{X}) = \emptyset$. We let $e(\mathrm{X})$ be the set of the closures of these connected components, i.e. the arcs $\gamma([0, 1])$ with the above notation, which are called the kernel edges. The multigraph ker($\mathrm{X}$) $= (k(\mathrm{X}), e(\mathrm{X}))$ is the kernel of $\mathrm{X}$, where the vertices incident to $e \in e(\mathrm{X})$ are, of course, the endpoints of $e$. An orientation of the edge $e$ is the choice of a parametrisation $\gamma : [0, 1] \to X$ of the arc $e$ or its reversal $\gamma(1 - \cdot)$, considered up to reparametrisations by increasing bijections from $[0, 1]$ to $[0, 1]$. If $e$ is given an orientation, then its endpoints are distinguished as the source and target vertices, and are denoted by $e^-, e^+$, respectively. The next proposition then follows from the definition of $k(\mathrm{X})$.

Proposition 6.4. The kernel of a non-unicyclic $\mathbb{R}$-graph without leaves is a multigraph of minimum degree at least 3.

Finally, we prove Theorem 2.7. Assume that $\mathrm{X}$ is a non-unicyclic $\mathbb{R}$-graph without leaves, and let $(\ell(e), e \in e(\mathrm{X}))$ be the lengths of the kernel edges. Note that if $x, y \in k(\mathrm{X})$, then

\[ d(x, y) = \inf\Big\{ \sum_{i=1}^{k} \ell(e_i) : (e_1, \ldots, e_k) \text{ a chain from } x \text{ to } y \text{ in } G \Big\}, \]

where $(e_1, \ldots, e_k)$ is a chain from $x$ to $y$ if it is possible to orient $e_1, \ldots, e_k \in e(\mathrm{X})$ in such a way that $e_1^- = x$, $e_k^+ = y$ and $e_i^+ = e_{i+1}^-$ for every $i \in \{1, \ldots, k - 1\}$. Of course, it suffices to restrict the infimum to those chains that are simple, in the sense that they do not visit the same vertex twice. Since there are finitely many simple chains, the above infimum is, in fact, a minimum. Next, if $x$ and $y$ are elements of $e$ and $e'$ respectively, consider an arbitrary orientation of $e, e'$. Then a shortest path from $x$ to $y$ either stays in $e$ (in this case $e = e'$), or passes through at least one element of $k(\mathrm{X})$ incident to $e$, and likewise for $e'$. Therefore,

\[ d(x, y) = d_e(x, y) \wedge \min_{s, t \in \{-, +\}} \big[ d_e(x, e^s) + d(e^s, (e')^t) + d_{e'}((e')^t, y) \big], \]

where we let $d_e(a, b)$ be the length of the arc of $e$ between $a$ and $b$ if $a, b \in e$, and $\infty$ otherwise. It is shown in [22, Section 3] that this formula gives the distance for the metric gluing of the graph with edge-lengths $(k(\mathrm{X}), e(\mathrm{X}), (\ell(e), e \in e(\mathrm{X})))$. This proves Theorem 2.7.
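
The two-step distance formula above (first distances between kernel vertices, then distances between arbitrary points on kernel edges) translates directly into a small computation. The sketch below is illustrative only: a point is encoded, hypothetically, as a pair `(edge index, offset from the source endpoint)`, and edges as `(source, target, length)` triples with an arbitrary orientation.

```python
import itertools


def kernel_distances(n, edges):
    """All-pairs shortest-path distances between kernel vertices (Floyd-Warshall)."""
    d = [[float("inf")] * n for _ in range(n)]
    for v in range(n):
        d[v][v] = 0.0
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    # itertools.product varies its first coordinate slowest, so k is the outer loop.
    for k, i, j in itertools.product(range(n), repeat=3):
        d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d


def point_distance(n, edges, p, q):
    """Distance between points p = (edge, offset) and q = (edge, offset)."""
    d = kernel_distances(n, edges)
    (e, a), (f, b) = p, q
    ue, ve, le = edges[e]
    uf, vf, lf = edges[f]
    # d_e(x, y): along the common edge, or infinity if the edges differ.
    best = abs(a - b) if e == f else float("inf")
    # Leave e through one endpoint, enter e' through one endpoint.
    for (s, ds), (t, dt) in itertools.product(
        [(ue, a), (ve, le - a)], [(uf, b), (vf, lf - b)]
    ):
        best = min(best, ds + d[s][t] + dt)
    return best


if __name__ == "__main__":
    theta = [(0, 1, 1.0), (0, 1, 2.0), (0, 1, 3.0)]
    print(point_distance(2, theta, (0, 0.5), (1, 1.0)))
    print(point_distance(2, theta, (2, 0.25), (2, 2.75)))
```

In the second call both points lie on the same (long) edge, and the shortest path leaves that edge and uses the short parallel edge, which is the case covered by the minimum over $s, t \in \{-, +\}$.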



6.4 Stability of the kernel in the Gromov-Hausdorff topology

In this section, we show that kernels of $\mathbb{R}$-graphs are stable under small perturbations in the Gromov-Hausdorff metric, under an assumption which says, essentially, that the girth is uniformly bounded away from 0.

Recall from Section 3.2 that $\mathcal{A}_r$ is the set of measured $\mathbb{R}$-graphs $\mathrm{X}$ such that

\[ \min_{e \in e(\mathrm{X})} \ell(e) \ge r, \qquad \sum_{e \in e(\mathrm{X})} \ell(e) \le 1/r \qquad \text{and} \qquad s(\mathrm{X}) \le 1/r, \]

where it is understood in this definition that the unicyclic $\mathbb{R}$-graphs (those with surplus 1) are such that their unique embedded cycle has length in $[r, 1/r]$. It follows that the sets $\mathcal{A}_r$, $0 < r < 1$, are decreasing, with union the set of all measured $\mathbb{R}$-graphs. If $[X, d]$ is an $\mathbb{R}$-graph, we write $[X, d] \in \mathcal{A}_r$ if $[X, d, 0] \in \mathcal{A}_r$. Note that an element $\mathrm{X} \in \mathcal{A}_r$ has $\mathrm{gir}(\mathrm{X}) \ge r$. Likewise, we let $\mathcal{A}_r^\bullet$ be the set of (isometry equivalence classes of) safely pointed measured $\mathbb{R}$-graphs $(X, d, x, \mu)$ with $(X, d, \mu) \in \mathcal{A}_r$ (recall Definition 3.2), and say that a pointed $\mathbb{R}$-graph $(X, d, x) \in \mathcal{A}_r^\bullet$ if $(X, d, x, 0) \in \mathcal{A}_r^\bullet$.

A subset $A$ of $X$ is said to be in correspondence with a subset $A'$ of $X'$ via $C \subset X \times X'$ if $C \cap (A \times A')$ is a correspondence between $A$ and $A'$. Let $\mathrm{X}$ and $\mathrm{X}'$ be $\mathbb{R}$-graphs with surplus at least 2. Given $C \in C(\mathrm{X}, \mathrm{X}')$, for $\epsilon > 0$ we say that $C$ is an $\epsilon$-overlay (of $\mathrm{X}$ and $\mathrm{X}'$) if $\mathrm{dis}(C) < \epsilon$, and there exists a multigraph isomorphism $\chi$ between ker($\mathrm{X}$) and ker($\mathrm{X}'$) such that:

1. For every $v \in k(\mathrm{X})$, $(v, \chi(v)) \in C$.

2. For every $e \in e(\mathrm{X})$, the edges $e$ and $\chi(e)$ are in correspondence via $C$, and

\[ |\ell(e) - \ell(\chi(e))| < \epsilon. \]

If $s(\mathrm{X}) = s(\mathrm{X}') = 1$, an $\epsilon$-overlay is a correspondence with distortion at most $\epsilon$, such that the unique embedded cycles $c, c'$ of $\mathrm{X}$ and $\mathrm{X}'$ are in correspondence via $C$, and $|\ell(c) - \ell(c')| < \epsilon$. Finally, if $s(\mathrm{X}) = s(\mathrm{X}') = 0$ then an $\epsilon$-overlay is just a correspondence of distortion at most $\epsilon$.

Proposition 6.5. Fix $r \in (0, 1)$. For every $\epsilon > 0$ there exists $\delta > 0$ such that if $\mathrm{X} = (X, d)$ and $\mathrm{X}' = (X', d')$ are elements of $\mathcal{A}_r$ and $C \in C(\mathrm{X}, \mathrm{X}')$ has $\mathrm{dis}(C) < \delta$, then there exists an $\epsilon$-overlay $C' \in C(\mathrm{X}, \mathrm{X}')$ with $C \subset C'$.

We say that a sequence of graphs with edge-lengths $((V_n, E_n, (\ell_n(e), e \in E_n)), n \ge 1)$ converges to the graph with edge-lengths $(V, E, (\ell(e), e \in E))$ if $(V_n, E_n)$ and $(V, E)$ are isomorphic for all but finitely many $n \ge 1$, through an isomorphism $\chi_n$ such that $\ell_n(\chi_n(e)) \to \ell(e)$ as $n \to \infty$ for every $e \in E$. We now state some consequences of Proposition 6.5 which are used in the proof of Theorem 4.1 and in Section 7.3, before proceeding to the proof of Proposition 6.5. Recall the definition of the distances $d_{\mathrm{GHP}}^{k,l}$ from Section 2.1.

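Corollary 6.6. Fix $r \in (0, 1)$. Let $(\mathrm{X}^n = (X^n, d^n, \mu^n), n \ge 1)$ and $\mathrm{X} = (X, d, \mu)$ be elements of $\mathcal{A}_r$. Suppose that $d_{\mathrm{GHP}}(\mathrm{X}^n, \mathrm{X}) \to 0$, as $n \to \infty$.

(i) Then ker($\mathrm{X}^n$) converges to ker($\mathrm{X}$) as a graph with edge-lengths. As a consequence, $r(\mathrm{X}^n) \to r(\mathrm{X})$, and writing $L^n$ (resp. $L$) for the restriction of the length measure of $\mathrm{X}^n$ (resp. $\mathrm{X}$) to conn($\mathrm{X}^n$) (resp. conn($\mathrm{X}$)), it holds that

\[ d_{\mathrm{GHP}}^{0,2}\big( (X^n, d^n, \mu^n, L^n), (X, d, \mu, L) \big) \longrightarrow 0 \quad \text{as } n \to \infty. \]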

(ii) Let $x^n$ be a random variable in $X^n$ with distribution $L^n / L^n(\mathrm{conn}(\mathrm{X}^n))$ and $x$ be a random variable in $X$ with distribution $L / L(\mathrm{conn}(\mathrm{X}))$. Then as $n \to \infty$,

\[ (X^n, d^n, x^n, \mu^n) \overset{d}{\longrightarrow} (X, d, x, \mu) \]

in the space $(\mathcal{M}^{1,1}, d_{\mathrm{GHP}}^{1,1})$.

The above results rely on the following lemma. Given metric spaces $(X, d)$ and $(X', d')$, $C \subset X \times X'$ and $r > 0$, let

\[ C_r = \{ (y, y') \in X \times X' : d(x, y) \vee d'(x', y') < r \text{ for some } (x, x') \in C \}. \]

$C_r$ is the $r$-enlargement of $C$ with respect to the product distance. Note that if $C$ is a correspondence between $X$ and $X'$, then $C_r$ is also a correspondence for every $r > 0$. Moreover, $\mathrm{dis}(C_r) \le \mathrm{dis}(C) + 4r$. A mapping $\phi : [a, b] \to [a', b']$ is called bi-Lipschitz if $\phi$ is a bijection such that $\phi$ and $\phi^{-1}$ are Lipschitz, and we call the quantity

\[ K(\phi) = \inf\big\{ K \ge 1 : K^{-1} |x - y| \le |\phi(x) - \phi(y)| \le K |x - y| \text{ for every } x, y \in [a, b] \big\} \]

the bi-Lipschitz constant of $\phi$. By convention, we let $K(\phi) = \infty$ if $\phi$ is not a bijection, or not bi-Lipschitz.
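
The distortion of a correspondence and the bound $\mathrm{dis}(C_r) \le \mathrm{dis}(C) + 4r$ for its $r$-enlargement can be checked on small finite metric spaces. The following sketch makes the simplifying assumption that both spaces are finite subsets of the real line; all names are hypothetical.

```python
import itertools


def line_metric(a, b):
    """Distance on the real line, used here for both toy spaces."""
    return abs(a - b)


def distortion(C, d, d2):
    """dis(C) = max |d(x, y) - d'(x', y')| over pairs (x, x'), (y, y') in C."""
    return max(
        abs(d(x, y) - d2(xp, yp))
        for (x, xp), (y, yp) in itertools.product(C, repeat=2)
    )


def enlargement(C, X, Xp, d, d2, r):
    """r-enlargement: pairs (y, y') within distance < r of some (x, x') in C."""
    return {
        (y, yp)
        for y, yp in itertools.product(X, Xp)
        if any(max(d(x, y), d2(xp, yp)) < r for x, xp in C)
    }


if __name__ == "__main__":
    X, Xp = [0.0, 1.0, 2.0], [0.0, 1.0, 2.5]
    C = set(zip(X, Xp))
    r = 1.5
    Cr = enlargement(C, X, Xp, line_metric, line_metric, r)
    # The enlargement contains C and its distortion grows by at most 4r.
    print(distortion(C, line_metric, line_metric),
          distortion(Cr, line_metric, line_metric))
```

The factor 4 comes from applying the triangle inequality twice in each of the two spaces.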

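Lemma 6.7. Fix $r \in (0, 1)$ and let $(X, d), (X', d') \in \mathcal{A}_r$. Suppose there exists a correspondence $C$ between $X$ and $X'$ such that $\mathrm{dis}(C) < r/56$.

Let $x, y \in X$ be two distinct points in $X$, and let $f$ be a local geodesic from $x$ to $y$. Let $x', y' \in X'$ be such that $(x, x'), (y, y') \in C$. Then there exists a local geodesic $f'$ from $x'$ to $y'$ with

\[ \mathrm{len}(f') \le \Big( 1 + \frac{64\, \mathrm{dis}(C)}{r \wedge \mathrm{len}(f)} \Big) \cdot \mathrm{len}(f), \]

and a bi-Lipschitz mapping $\phi : [0, \mathrm{len}(f)] \to [0, \mathrm{len}(f')]$ such that $(f(t), f'(\phi(t))) \in C_{8\mathrm{dis}(C)}$ for every $t \in [0, \mathrm{len}(f)]$, and

\[ K(\phi) \le \Big( 1 - \frac{64\, \mathrm{dis}(C)}{r \wedge \mathrm{len}(f)} \Big)_+^{-1}. \]

Note that the second part of the statement also implies a lower bound on the length of $f'$, namely,

\[ \mathrm{len}(f') \ge K(\phi)^{-1}\, \mathrm{len}(f) \ge \mathrm{len}(f) \Big( 1 - \frac{64\, \mathrm{dis}(C)}{r \wedge \mathrm{len}(f)} \Big)_+, \]

which is, of course, useless when $r \wedge \mathrm{len}(f) \le 64\, \mathrm{dis}(C)$.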

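Proof. Let us first assume that $0 < \mathrm{len}(f) \le r/8$, so in particular $d(x, y) < R(\mathrm{X})$ and $f$ is the geodesic from $x$ to $y$. We have

\[ d(x, y) - \mathrm{dis}(C) \le d'(x', y') \le d(x, y) + \mathrm{dis}(C) \le r/8 + \mathrm{dis}(C) < R(\mathrm{X}'), \]

so that $x'$ and $y'$ are linked by a unique geodesic $f'$. Set $\phi(t) = d'(x', y')\, t / d(x, y)$ for $0 \le t \le d(x, y)$. From the preceding chain of inequalities, we obtain that

\[ \mathrm{len}(f') \le \mathrm{len}(f) + \mathrm{dis}(C), \qquad \text{and} \qquad K(\phi) \le \Big( 1 - \frac{\mathrm{dis}(C)}{\mathrm{len}(f)} \Big)_+^{-1}. \]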



Fix $z = f(t) \in \mathrm{Im}(f)$ and let $z''$ be such that $(z, z'') \in C$. Then $d'(x', z'') \le d(x, z) + \mathrm{dis}(C) < r/4$, so that $z''$ belongs to the $\mathbb{R}$-tree $B_{R(\mathrm{X}')}(x')$. Let $z'$ be the (unique) point of $\mathrm{Im}(f')$ that is closest to $z''$. Then a path from $x'$ or $y'$ to $z''$ must pass through $z'$, from which we have

\[ d'(z', z'') = \frac{d'(x', z'') + d'(y', z'') - d'(x', y')}{2} \le \frac{3}{2}\, \mathrm{dis}(C). \tag{32} \]

Therefore,

\[ t - \frac{5}{2}\, \mathrm{dis}(C) \le d'(x', z'') - d'(z'', z') \le d'(x', z') \le d'(x', z'') \le t + \mathrm{dis}(C), \]

so after a short calculation we get that

\[ \Big| d'(x', z') - \frac{d'(x', y')}{d(x, y)}\, t \Big| \le \frac{7}{2}\, \mathrm{dis}(C). \]

From this we obtain

\[ d'(z', f'(\phi(t))) = d'\Big( f'(d'(x', z')), \, f'\Big( \frac{d'(x', y')}{d(x, y)}\, t \Big) \Big) = \Big| d'(x', z') - \frac{d'(x', y')}{d(x, y)}\, t \Big| \le \frac{7}{2}\, \mathrm{dis}(C), \]

so that, in conjunction with (32), we have $(f(t), f'(\phi(t))) \in C_{5\mathrm{dis}(C)}$.

We next assume that $\mathrm{len}(f) > r/8$. Fix an integer $N$ such that $r/16 < \mathrm{len}(f)/N \le r/8$ and let $t_i = i\, \mathrm{len}(f)/N$ and $x_i = f(t_i)$ for $0 \le i \le N$. By Proposition 6.1, since $f$ is a local geodesic and $t_{i+1} - t_{i-1} \le R(\mathrm{X})$ for every $i \in \{1, \ldots, N - 1\}$, the restriction $f|_{[t_{i-1}, t_{i+1}]}$ must be a shortest path and so

\[ d(x_{i-1}, x_{i+1}) = \frac{2\, \mathrm{len}(f)}{N} \le r/4, \qquad d(x_i, x_{i+1}) = \frac{\mathrm{len}(f)}{N} \in [r/16, r/8]. \]



Letting $x''_i$ be a point such that $(x_i, x''_i) \in C$ (where we always make the choice $x''_0 = x'$ and $x''_N = y'$), we have $d'(x''_i, x''_{i+1}) \le d(x_i, x_{i+1}) + \mathrm{dis}(C) < R(\mathrm{X}')$, so that we can consider the unique geodesic $f''_i$ between $x''_i$ and $x''_{i+1}$. The concatenation of the paths $f''_0, f''_1, \ldots, f''_{N-1}$ is not necessarily a local geodesic, but by excising certain parts of it we will be able to recover a local geodesic between $x'$ and $y'$. For each $i \in \{1, \ldots, N - 1\}$, the sets $\mathrm{Im}(f''_{i-1})$ and $\mathrm{Im}(f''_i)$ are included in the $\mathbb{R}$-tree $B_{R(\mathrm{X}')}(x''_i)$, and the concatenation of $f''_{i-1}$ and $f''_i$ is a path from $x''_{i-1}$ to $x''_{i+1}$ which, as such, must contain the image of the geodesic $\gamma_i$ between these points. Let $x'_i$ be the unique point of $\mathrm{Im}(\gamma_i)$ that is closest to $x''_i$, and let $x'_0 = x'$, $x'_N = y'$. Then

\[ d'(x'_i, x''_i) = \frac{d'(x''_{i-1}, x''_i) + d'(x''_{i+1}, x''_i) - d'(x''_{i-1}, x''_{i+1})}{2} \le \frac{3}{2}\, \mathrm{dis}(C), \]

so that, for $i \in \{0, 1, \ldots, N - 1\}$,

\[ d(x_i, x_{i+1}) - \mathrm{dis}(C) \le d'(x''_i, x''_{i+1}) \le d'(x'_i, x'_{i+1}) \le d'(x''_i, x''_{i+1}) + 3\, \mathrm{dis}(C) \le d(x_i, x_{i+1}) + 4\, \mathrm{dis}(C). \]



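If $x'_{i+1} \in \mathrm{Im}(f''_{i-1})$ then

\[ d'(x''_{i-1}, x''_{i+1}) \le d'(x''_{i-1}, x''_i) + \frac{3}{2}\, \mathrm{dis}(C) \le \frac{\mathrm{len}(f)}{N} + \frac{5}{2}\, \mathrm{dis}(C). \]

However, since $(x_{i-1}, x''_{i-1}), (x_{i+1}, x''_{i+1}) \in C$ and $\mathrm{dis}(C) < r/56 < 2\, \mathrm{len}(f)/(7N)$, we have

\[ d'(x''_{i-1}, x''_{i+1}) \ge \frac{2\, \mathrm{len}(f)}{N} - \mathrm{dis}(C) > \frac{\mathrm{len}(f)}{N} + \frac{5}{2}\, \mathrm{dis}(C), \]

so, in fact, $x'_{i+1} \notin \mathrm{Im}(f''_{i-1})$ and, in particular, $x'_{i+1}$ does not lie on the shortest path between $x'_{i-1}$ and $x'_i$. From this it follows that if $f'$ denotes the concatenation of the geodesics $f'_i$ between $x'_i$ and $x'_{i+1}$, for $0 \le i \le N - 1$, then $f'$ is a local geodesic between $x'$ and $y'$. Its length is certainly bounded by the sum of the lengths of the paths $f''_i$, so that

\[ \mathrm{len}(f') \le \sum_{i=0}^{N-1} d(x_i, x_{i+1}) + 4N\, \mathrm{dis}(C) \le \mathrm{len}(f) + \frac{64\, \mathrm{len}(f)\, \mathrm{dis}(C)}{r}, \]

as claimed. Next, we let $\phi_i(t) = d'(x'_i, x'_{i+1})\, t / d(x_i, x_{i+1})$, so that

\[ K(\phi_i) \le \Big( 1 - \frac{4\, \mathrm{dis}(C)}{d(x_i, x_{i+1})} \Big)_+^{-1} \le \Big( 1 - \frac{64\, \mathrm{dis}(C)}{r} \Big)_+^{-1}. \]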

If $\phi : [0, \mathrm{len}(f)] \to [0, \mathrm{len}(f')]$ is the concatenation of the mappings $\phi_i$, $0 \le i \le N - 1$, then $\phi$ is bi-Lipschitz with the same upper-bound for $K(\phi)$ as for each $K(\phi_i)$. Finally, we note that $f' \circ \phi$ is the concatenation of the paths $f'_i \circ \phi_i$. If $z = f_i(t)$, we let $z''$ be such that $(z, z'') \in C$ and let $z'$ be the point in $\mathrm{Im}(f'_i)$ that is closest to $z''$. Then similar arguments to before entail that $d'(z', z'') \le 3\, \mathrm{dis}(C)$, so that

\[ t - 4\, \mathrm{dis}(C) \le d'(x'_i, z') \le d'(x'_i, z'') \le t + \mathrm{dis}(C), \]

which implies that

\[ d'(z', f'_i(\phi_i(t))) = \Big| d'(x'_i, z') - \frac{d'(x'_i, x'_{i+1})}{d(x_i, x_{i+1})}\, t \Big| \le 5\, \mathrm{dis}(C), \]

and we conclude that $(f_i(t), f'_i(\phi_i(t))) \in C_{8\mathrm{dis}(C)}$, and so that $(f(s), f'(\phi(s))) \in C_{8\mathrm{dis}(C)}$ for every $s \in [0, \mathrm{len}(f)]$. □

Proof of Proposition 6.5. We will prove this result only when one of $\mathrm{X}$ and $\mathrm{X}'$ (and then, in fact, both) has surplus at least 2, leaving the similar and simpler case of surplus 1 to the reader (the case of surplus 0 is trivial). Also, we may assume without loss of generality that $\epsilon < r/4$.

Fix $\epsilon \in (0, r/4)$, and fix any $\delta \in (0, \epsilon r^2/128)$. Also, fix $\mathrm{X}, \mathrm{X}' \in \mathcal{A}_r$ and a correspondence $C \in C(\mathrm{X}, \mathrm{X}')$ with $\mathrm{dis}(C) < \delta$. List the elements of $k(\mathrm{X})$ as $v_1, \ldots, v_n$, and fix elements $v''_1, \ldots, v''_n$ of $X'$ with $(v_i, v''_i) \in C$ for each $1 \le i \le n$. Since $\mathrm{dis}(C) < \delta$ and $v_1, \ldots, v_n$ are pairwise at distance at least $r$, $v''_1, \ldots, v''_n$ are pairwise at distance at least $r - 2\delta > r/2$ and, in particular, are all distinct. Next, for every $e \in e(\mathrm{X})$, say with $e^+ = v_i$, $e^- = v_j$, fix a local geodesic $f_e$ between $v_i$ and $v_j$ with $\mathrm{Im}(f_e) = e$ and $f_e(0) = e^-$. By Lemma 6.7, there exists a local geodesic $f''_e$ from $v''_j$ to $v''_i$ and a bi-Lipschitz mapping $\phi_e : [0, \ell(e)] \to [0, \mathrm{len}(f''_e)]$ with

\[ K(\phi_e) \le \Big( 1 - \frac{64\delta}{r} \Big)^{-1} \le 2, \]

and such that $(f_e(t), f''_e(\phi_e(t))) \in C_{8\delta}$ for every $t \in [0, \ell(e)]$. In particular, it follows that $\mathrm{len}(f''_e) \ge r/2$. Then we claim that for $\delta$ small enough, the following two properties hold.

1. For every $e \in e(\mathrm{X})$, the path $(f''_e(t), \ \epsilon/8 \le t \le \mathrm{len}(f''_e) - \epsilon/8)$ is injective.

2. For $e_1, e_2 \in e(\mathrm{X})$ with $e_1 \ne e_2$, we have

\[ \{ f''_{e_1}(t) : \epsilon/8 \le t \le \mathrm{len}(f''_{e_1}) - \epsilon/8 \} \cap \{ f''_{e_2}(t) : \epsilon/8 \le t \le \mathrm{len}(f''_{e_2}) - \epsilon/8 \} = \emptyset. \]

To establish the first property, suppose that $f''_e(t) = f''_e(t')$ for some $e \in e(\mathrm{X})$ and distinct $t, t' \in [\epsilon/8, \mathrm{len}(f''_e) - \epsilon/8]$. For concreteness, let us assume that $e^- = v_i$ and $e^+ = v_j$. Since $f''_e$ is a local geodesic, this implies that $|t - t'| \ge R(\mathrm{X}') \ge r/4$. Moreover, since $(f_e(\phi_e^{-1}(t)), f''_e(t)), (f_e(\phi_e^{-1}(t')), f''_e(t')) \in C_{8\delta}$ and since $\delta < \epsilon/128$, we have

\[ d\big( f_e(\phi_e^{-1}(t)), f_e(\phi_e^{-1}(t')) \big) \le d'\big( f''_e(t), f''_e(t') \big) + 8\delta = 8\delta < \epsilon/16. \tag{33} \]

On the other hand, we have

\[ |\phi_e^{-1}(t) - \phi_e^{-1}(t')| \ge K(\phi_e)^{-1} |t - t'| \ge \frac{1}{2} |t - t'| \ge r/8 > \epsilon/2, \]

and since $\phi_e^{-1}(0) = 0$ and $\phi_e^{-1}(\mathrm{len}(f''_e)) = \ell(e)$,

\[ |\phi_e^{-1}(t)| \ge K(\phi_e)^{-1}\, t \ge \epsilon/16, \qquad |\phi_e^{-1}(t) - \ell(e)| \ge \epsilon/16, \]

and similarly for $t'$. But if $s, s' \in [\epsilon/16, \ell(e) - \epsilon/16]$, then $d(f_e(s), f_e(s')) \ge (\epsilon/8) \wedge |s - s'|$, because a path from $f_e(s)$ to $f_e(s')$ is either a subarc of $e$, or passes through both vertices $v_i$ and $v_j$. It follows that $d(f_e(\phi_e^{-1}(t)), f_e(\phi_e^{-1}(t'))) \ge \epsilon/8$, in contradiction with (33). This yields that property 1 holds.

The argument for property 2 is similar: for every $t_1 \in (\epsilon/8, \mathrm{len}(f''_{e_1}) - \epsilon/8)$ and $t_2 \in (\epsilon/8, \mathrm{len}(f''_{e_2}) - \epsilon/8)$, there exist $x_1 \in e_1$ and $x_2 \in e_2$ such that $(x_1, f''_{e_1}(t_1))$ and $(x_2, f''_{e_2}(t_2))$ are in $C_{8\delta}$. Then the distance from $x_1, x_2$ to $k(\mathrm{X})$ is at least $\epsilon/16$ so that $d(x_1, x_2) \ge \epsilon/8$. From this, we deduce that $d'(f''_{e_1}(t_1), f''_{e_2}(t_2)) \ge d(x_1, x_2) - 8\delta > 0$.

Next, for every $i \in \{1, \ldots, n\}$, consider the points $f''_e(\epsilon/8)$, $e \in e(\mathrm{X})$ for which $e^- = v_i$, as well as the points $f''_e(\mathrm{len}(f''_e) - \epsilon/8)$ for which $e^+ = v_i$. These points are on the boundary of the ball $B_{\epsilon/8}(v''_i)$, which we recall is an $\mathbb{R}$-tree. Let $T_i$ be the subtree of $B_{\epsilon/8}(v''_i)$ spanned by these points. Then property 1 above shows that

\[ \bigcup_{1 \le i \le n} T_i \ \cup \bigcup_{e \in e(\mathrm{X})} \{ f''_e(t), \ \epsilon/8 \le t \le \mathrm{len}(f''_e) - \epsilon/8 \} \]

induces a closed subgraph of $(X', d')$ without leaves, and so this subgraph is in fact a subgraph of core($\mathrm{X}'$). Furthermore, property 2 implies that the points of degree at least 3 in this subgraph can only belong to $\bigcup_{1 \le i \le n} T_i$. Since any such point is then an element of $k(\mathrm{X}')$ and $\mathrm{diam}(T_i) \le \epsilon/4 < r$, we see that each $T_i$ can contain at most one element of $k(\mathrm{X}')$. On the other hand, each $T_i$ must contain at least one element of $k(\mathrm{X}')$ because $T_i$ has at least three leaves (since $v_i$ has degree at least 3). Thus, each $T_i$ contains exactly one element of $k(\mathrm{X}')$, which we denote by $v'_i$. Next, for $e \in e(\mathrm{X})$, if $e^- = v_i$, $e^+ = v_j$, then we let $f'_e$ be the simple path from $v'_i$ to $v'_j$ that has a non-empty intersection with $f''_e$. It is clear that this path is well-defined and unique. Letting $\chi(v_i) = v'_i$ for $1 \le i \le n$, and letting $\chi(e) = \mathrm{Im}(f'_e)$ for $e \in e(\mathrm{X})$, we have therefore defined a multigraph homomorphism from ker($\mathrm{X}$) to ker($\mathrm{X}'$), and this homomorphism is clearly injective. By symmetry of the roles of $\mathrm{X}$ and $\mathrm{X}'$, we see that $|k(\mathrm{X})| = |k(\mathrm{X}')|$ and $|e(\mathrm{X})| = |e(\mathrm{X}')|$, and so $\chi$ must, in fact, be a multigraph isomorphism.
Finally, since $\mathrm{len}(f_e) = \ell(e) \le 1/r$, we have

\[ |\ell(e) - \mathrm{len}(f''_e)| = |\mathrm{len}(f_e) - \mathrm{len}(f''_e)| \le \frac{64\delta}{r}\, \mathrm{len}(f_e) \le \frac{64\delta}{r^2} < \frac{\epsilon}{2}, \]

by our choice of $\delta$. But, by construction, $|\mathrm{len}(f''_e) - \ell(\chi(e))| \le \epsilon/2$, since the endpoints of $\chi(e)$ each have distance at most $\epsilon/4$ from an endpoint of $f''_e$. It follows that $|\ell(e) - \ell(\chi(e))| < \epsilon$.

Finally, since every point of $\chi(e)$ is within distance $\epsilon/4$ of $f''_e$ and $e = \mathrm{Im}(f_e)$ and $\mathrm{Im}(f''_e)$ are in correspondence via $C_{8\delta}$, it follows that $e$ and $\chi(e)$ are in correspondence via $C_{8\delta + \epsilon/4}$. Since $\mathrm{dis}(C_{8\delta + \epsilon/4}) \le \mathrm{dis}(C) + 16\delta + \epsilon/2 < \epsilon$, this completes the proof. □

Proof of Corollary 6.6. Again we only consider the case $s(X) > 1$, the case $s(X) = 1$ being
easier (and the case $s(X) = 0$ trivial).

Let $(X^n, n \ge 1)$ and $X$ be as in the statement of Corollary 6.6. Let $(C^n, n \ge 1)$
and $(\pi^n, n \ge 1)$ be sequences of correspondences and of measures, respectively, such that
$\mathrm{dis}(C^n)$, $\pi^n((C^n)^c)$ and $D(\pi^n; \mu^n, \mu)$ each converge to 0 as $n \to \infty$. The fact that $\ker(X^n)$
converges to $\ker(X)$ as a graph with edge-lengths is then an immediate consequence of
Proposition 6.5: for each $n$ sufficiently large, simply replace $C^n$ by $C^n \cup \hat{C}^n$, where $\hat{C}^n$ is an
$\varepsilon_n$-overlay of $X$ and $X^n$, for some sequence $\varepsilon_n \to 0$ (we may assume $\varepsilon_n > \mathrm{dis}(C^n)$). We continue
to write $C^n$ instead of $C^n \cup \hat{C}^n$, and note that enlarging $C^n$ diminishes $\pi^n((C^n)^c)$.

In particular, we obtain that for all large enough $n$, there is an isomorphism $\chi_n$ from
$(k(X), e(X))$ to $(k(X^n), e(X^n))$ such that $\ell(e) - \ell(\chi_n(e))$ converges to 0 for every $e \in e(X)$.
The fact that $r(X^n) \to r(X)$ is immediate. We now fix a particular orientation of the edges,
and view $\chi_n$ as an isomorphism of oriented graphs, in the sense that $\chi_n(e^-) = \chi_n(e)^-$.

For each $e \in e(X)$, let $f_e$ be a local geodesic between $e^-$ and $e^+$ with $f_e(0) = e^-$ and
$f_e(\ell(e)) = e^+$ and, for each $n \ge 1$ and $e \in e(X^n)$, define $f^n_e$ accordingly. Then for each $n$
sufficiently large, define a mapping $\Phi_n$ with domain $\mathrm{dom}(\Phi_n) = \bigcup_{e \in e(X)} f_e([0, \ell(e) - \varepsilon_n])$ by
setting $\Phi_n(f_e(t)) = f^n_{\chi_n(e)}(t)$ for each $e \in e(X)$ and each $0 \le t \le \ell(e) - \varepsilon_n$.

By considering a small enlargement of $C^n$, or, equivalently, by letting $\varepsilon_n$ tend to zero
sufficiently slowly, we may assume without loss of generality that $(x, \Phi_n(x)) \in C^n$ for all
$x \in \mathrm{dom}(\Phi_n)$. This comes from the fact that $e$ and $\chi_n(e)$ are in correspondence via $C^n$; we
leave the details of this verification to the reader. It follows that the relation
$\{(x, \Phi_n(x)) : x \in \mathrm{dom}(\Phi_n)\}$ is a subset of $C^n$.

Let $e_c(X)$ be the set of edges $e \in e(X)$ whose removal from $e(X)$ does not disconnect
$\ker(X)$. Clearly, $\mathrm{conn}(X) \subseteq k(X) \cup \bigcup_{e \in e_c(X)} e$, and the measure $L$ is carried by $\bigcup_{e \in e_c(X)} e$
(in fact, it is carried by the subset of points of $\bigcup_{e \in e_c(X)} e$ with degree 2, by Proposition 2.6).
Let $L'$ be the restriction of $L$ to the set $\mathrm{dom}(\Phi_n)$, which has total mass $\sum_{e \in e_c(X)} (\ell(e) - \varepsilon_n)$.
We consider the push-forward $\rho^n$ of $L'$ by the mapping $x \mapsto (x, \Phi_n(x))$ from $X$ to $X \times X^n$.
Then the second marginal of $\rho^n$ is the restriction of $L^n$ to $\bigcup_{e \in e_c(X)} \Phi_n(f_e([0, \ell(e) - \varepsilon_n]))$, so that

$$D(\rho^n; L, L^n) \le \sum_{e \in e(X)} \bigl(\varepsilon_n + |\ell(e) - \ell(\chi_n(e))|\bigr).$$

The latter converges to 0 by the convergence of the edge-lengths. It only remains to note
that $\rho^n(X \times X^n \setminus C^n) = 0$ by construction. This yields (i).

Finally, (i) implies that $(X^n, d^n, \mu^n, L^n / L^n(\mathrm{conn}(X^n)))$ converges to $(X, d, \mu, L / L(\mathrm{conn}(X)))$
in the metric $d^{0,2}_{\mathrm{GHP}}$ introduced in Section 2.1, and (ii) then follows from Proposition 2.1. □

7 Cutting safely pointed $\mathbb{R}$-graphs

In this section, we will consider a simple cutting procedure on $\mathbb{R}$-graphs, and study how this
procedure is perturbed by small variations in the Gromov-Hausdorff distance.

7.1 The cutting procedure

Let $(X, d)$ be an $\mathbb{R}$-graph, and let $x \in \mathrm{conn}(X)$. We endow the connected set $X \setminus \{x\}$ with
the intrinsic distance $d_{X \setminus \{x\}}$: more precisely, $d_{X \setminus \{x\}}(y, z)$ is defined to be the minimal length
of an injective path from $y$ to $z$ not visiting $x$. This is indeed a minimum because there are finitely
many injective paths between $y$ and $z$ in $X$, as a simple consequence of Theorem 2.7 applied
to $\mathrm{core}(X)$. The space $(X \setminus \{x\}, d_{X \setminus \{x\}})$ is not complete, so we let $(X_x, d_x)$ be its metric
completion as in Section 3.2. This space is connected, and thus easily seen to be an $\mathbb{R}$-graph.
We call it the $\mathbb{R}$-graph $(X, d)$ cut at the point $x$.

From now on, we will further assume that $\deg_X(x) = 2$, so that $(X, d, x)$ is safely pointed
as in Definition 3.2. In this case, one can provide a more detailed description of $(X_x, d_x)$.
A Cauchy sequence $(x_n, n \ge 1)$ in $(X \setminus \{x\}, d_{X \setminus \{x\}})$ is also a Cauchy sequence in $(X, d)$,
since $d \le d_{X \setminus \{x\}}$. If its limit $y$ in $(X, d)$ is distinct from $x$, then it is easy to see that
$d_{X \setminus \{x\}}(x_n, y) \to 0$, by considering a ball $B_\varepsilon(y)$ not containing $x$ within which $d = d_{X \setminus \{x\}}$.

So let us assume that $(x_n, n \ge 1)$ converges to $x$ for the distance $d$. Since $x$ has degree 2,
removing $x$ from the $\mathbb{R}$-tree $B_{R(X)}(x)$ leaves exactly two components, say $Y_1, Y_2$. It is clear that
$d_{X \setminus \{x\}}(z_1, z_2) \ge 2R(X)$ for every $z_1 \in Y_1$, $z_2 \in Y_2$. Since $(x_n, n \ge 1)$ is a Cauchy sequence for
$(X \setminus \{x\}, d_{X \setminus \{x\}})$, we conclude that it must eventually take all its values in precisely one of $Y_1$
and $Y_2$, let us say $Y_1$ for definiteness. Note that the restrictions of $d$ and $d_{X \setminus \{x\}}$ to $Y_1$ are equal,
so that if $(x'_n, n \ge 1)$ is another Cauchy sequence in $(X \setminus \{x\}, d_{X \setminus \{x\}})$ which converges in $(X, d)$
to $x$ and takes all but a finite number of values in $Y_1$, then $d_{X \setminus \{x\}}(x_n, x'_n) = d(x_n, x'_n) \to 0$,
and so this sequence is equivalent to $(x_n, n \ge 1)$.

We conclude that the completion of $(X \setminus \{x\}, d_{X \setminus \{x\}})$ adds exactly two points to $X \setminus \{x\}$,
corresponding to classes of Cauchy sequences converging to $x$ in $(X, d)$ "from one side" of $x$.
So we can write $X_x = (X \setminus \{x\}) \cup \{x_{(1)}, x_{(2)}\}$ and describe $d_x$ as follows:

• If $y, z \notin \{x_{(1)}, x_{(2)}\}$ then $d_x(y, z)$ is the minimal length of a path from $y$ to $z$ in $X$ not
visiting $x$.

• If $y \ne x_{(2)}$ then $d_x(x_{(1)}, y)$ is the minimal length of an injective path from $x$ to $y$ in
$X$ which takes its values in the component $Y_1$ on some small initial interval $(0, \varepsilon)$, and
similarly for $d_x(x_{(2)}, y)$ with $y \ne x_{(1)}$.

• Finally, $d_x(x_{(1)}, x_{(2)})$ is the minimal length of an embedded cycle passing through $x$.
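
As a purely illustrative finite analogue of the three cases above (it is not part of the argument), one can split a degree-2 vertex of a weighted graph into two copies, one per incident edge, and recompute shortest-path distances. The function names `dijkstra` and `cut_at_vertex`, the edge format `(u, v, length)`, and the encoding of the two copies as the tuples `(x, 1)` and `(x, 2)` are all assumptions of this sketch.

```python
import heapq
from collections import defaultdict
from itertools import count


def dijkstra(adj, source):
    """Shortest-path lengths from `source` in a weighted undirected graph."""
    dist = {source: 0.0}
    tie = count()  # tie-breaker so that vertex labels are never compared
    heap = [(0.0, next(tie), source)]
    while heap:
        d, _, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, next(tie), v))
    return dist


def cut_at_vertex(edges, x):
    """Adjacency lists of the graph in which the degree-2 vertex `x` is
    replaced by two copies (x, 1) and (x, 2), one for each incident edge."""
    incident = [e for e in edges if x in (e[0], e[1])]
    if len(incident) != 2:
        raise ValueError("x must have exactly two incident edges")
    adj = defaultdict(list)
    side = 0
    for u, v, w in edges:
        if x in (u, v):
            side += 1  # first incident edge -> copy 1, second -> copy 2
            copy = (x, side)
            u = copy if u == x else u
            v = copy if v == x else v
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj


# A 4-cycle with unit edge lengths, cut at vertex 0.
square = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]
dist_from_copy_1 = dijkstra(cut_at_vertex(square, 0), (0, 1))
```

In this toy example the distance between the two copies of the cut vertex equals the length of the unique cycle through it, matching the third bullet, and the distance from the first copy to vertex 3 grows from 1 (in the uncut graph) to 3.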

If $(X, d, x, \mu)$ is a pointed measured metric space such that $(X, d, x)$ is a safely pointed
$\mathbb{R}$-graph, and $\mu(\{x\}) = 0$, then the space $(X_x, d_x)$ carries a natural measure $\mu'$, such that
$\mu'(\{x_{(1)}, x_{(2)}\}) = 0$ and, for any open subset $A \subseteq X_x$ not containing $x_{(1)}$ and $x_{(2)}$, $\mu'(A) =
\mu(A)$ if on the right-hand side we view $A$ as an open subset of $X$. Consequently, there is
little risk of ambiguity in using the notation $\mu$ instead of $\mu'$ for this induced measure.

We finish this section by proving Lemma 5.5 on the number of balls required to cover 
the cut space. 

Proof of Lemma 5.5. Let $B_1, B_2, \dots, B_N$ be a covering of $X$ by open balls of radius $r > 0$,
centred at $x_1, \dots, x_N$ respectively. By definition, any point of $X$ can be joined to the centre
of some ball $B_i$ by a geodesic path of length $< r$. If such a path does not pass through $x$,
then it is also a geodesic path in $X_x$. Now since $B_1, \dots, B_N$ is a covering of $X$, this implies
that any point $y$ in $X$ can either

• be joined to some point $x_i$ by a path of length $< r$ that does not pass through $x$,

• or be joined to $x$ through a path $\gamma$ of length $< r$.

In the first case, this means that $y$ belongs to the ball with centre $x_i$ and radius $r$ in $X_x$.
In the second case, depending on whether the initial segment of $\gamma$ belongs to $Y_1$ or $Y_2$, this
means that $y$ belongs to the ball with centre $x_{(1)}$ or $x_{(2)}$ with radius $r$ in $X_x$. This yields a
covering of $X_x$ with at most $N + 2$ balls, as desired.

Conversely, it is clear that if $N$ balls are sufficient to cover $X_x$ then the same is true
of $X$, because distances are smaller in $X$ than in $X_x$ (if $x$ is identified with the points
$\{x_{(1)}, x_{(2)}\}$). □

7.2 Stability of the cutting procedure

The following statement will be used in conjunction with Corollary 6.6 (ii). Recall the
definition of $\mathcal{A}_r^{\bullet}$ from the start of Section 6.4.

Theorem 7.1. Fix $r \in (0, 1)$. Let $(X^n, d^n, x^n, \mu^n)$, $n \ge 1$, and $(X, d, x, \mu)$ be elements of $\mathcal{A}_r^{\bullet}$.
Suppose that

$$d^{1,1}_{\mathrm{GHP}}\bigl((X^n, d^n, x^n, \mu^n), (X, d, x, \mu)\bigr) \longrightarrow 0,$$

and that $\mu^n(\{x^n\}) = \mu(\{x\}) = 0$ for every $n$. Then

$$d_{\mathrm{GHP}}\bigl((X^n_{x^n}, d^n_{x^n}, \mu^n), (X_x, d_x, \mu)\bigr) \underset{n \to \infty}{\longrightarrow} 0.$$

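Our proof of Theorem 7.1 hinges on two lemmas; to state these lemmas we require a few
additional definitions. Let $\mathrm{X} = (X, d, x, \mu) \in \mathcal{A}_r^{\bullet}$ and recall the definition of the projection
$\alpha : X \to \mathrm{core}(X)$. For $\varepsilon > 0$, write

$$\tilde{B}_\varepsilon(x) = \{ y \in X : d(\alpha(y), x) < \varepsilon \} \quad \text{and} \quad h_\varepsilon(\mathrm{X}) = \mathrm{diam}(\tilde{B}_\varepsilon(x)),$$

so that $B_\varepsilon(x) \subseteq \tilde{B}_\varepsilon(x)$. The sets $\tilde{B}_\varepsilon(x)$ decrease to the singleton $\{x\}$ as $\varepsilon \downarrow 0$, because
$\deg_X(x) = 2$. Consequently, $h_\varepsilon(\mathrm{X})$ converges to 0 as $\varepsilon \downarrow 0$. For $\varepsilon > 0$ sufficiently small, the
set $X_{x,\varepsilon} = X \setminus \tilde{B}_\varepsilon(x)$, endowed with the intrinsic metric, is an $\mathbb{R}$-graph. In fact, it is easy to
see that for $\varepsilon < R(X)$, this intrinsic metric is just the restriction of $d_x$ to $X_{x,\varepsilon}$.

Let us assume that $\varepsilon < d(x, k(X)) \wedge R(X)$. Let $x_{(1),\varepsilon}, x_{(2),\varepsilon}$ be the two points of $\mathrm{core}(X)$
at distance $\varepsilon$ from $x$, labelled in such a way that, in the notation of Section 7.1, $x_{(1),\varepsilon}$ is the
point closest to $x_{(1)}$ in $X_x$, or in other words, such that $x_{(1),\varepsilon} \in Y_1$. For $i \in \{1, 2\}$ let $P_i$ be the
geodesic arc between $x_{(i),\varepsilon}$ and $x$ in $X$. We let $B_{(i),\varepsilon} = \{ w \in \tilde{B}_\varepsilon(x) \setminus \{x\} : \alpha(w) \in P_i \} \cup \{x_{(i)}\}$,
which we see as a subset of $X_x$. See Figure 4 for an illustration.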

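Now let $\mathrm{X}' = (X', d', x', \mu') \in \mathcal{A}_r^{\bullet}$. Just as we defined the space $\mathrm{X}_x = (X_x, d_x)$, we define
the space $\mathrm{X}'_{x'} = (X'_{x'}, d'_{x'})$, with $X'_{x'} = (X' \setminus \{x'\}) \cup \{x'_{(1)}, x'_{(2)}\}$. We likewise define the sets
$\tilde{B}_\varepsilon(x')$ and $B'_{(1),\varepsilon}$, $B'_{(2),\varepsilon}$ and the points $x'_{(1),\varepsilon}, x'_{(2),\varepsilon}$ for $\varepsilon < d'(x', k(X')) \wedge R(X')$ as above. We
will use the same notation $\alpha$ for the projection $X' \to \mathrm{core}(X')$.

Lemma 7.2. Fix $\delta > 0$. If $C$ is a $\delta$-overlay of $\mathrm{X}$ and $\mathrm{X}'$ then for every $(y, y') \in C$, we have
$(\alpha(y), \alpha(y')) \in C_{2\delta}$.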

Proof. Let $y''$ be such that $(\alpha(y), y'') \in C$. Since $C$ is a $\delta$-overlay, $d'(y'', \mathrm{core}(X')) < \delta$. In
particular, if $\alpha(y'') = \alpha(y')$ then we have $d'(\alpha(y'), y'') < \delta$. Otherwise, a geodesic from $y'$ to
$y''$ must pass through $\alpha(y')$ and $\alpha(y'')$, so that

$$d'(y', \alpha(y')) + d'(\alpha(y'), y'') = d'(y', y'') \le d(y, \alpha(y)) + \delta.$$

Figure 4: Part of an $\mathbb{R}$-graph; $\mathrm{core}(X)$ is drawn with a thicker line.

On the other hand, since $C$ is a $\delta$-overlay, we know that $\mathrm{core}(X)$ and $\mathrm{core}(X')$ are in
correspondence via $C$, which implies that

$$d'(y', \alpha(y')) = d'(y', \mathrm{core}(X')) \ge d(y, \mathrm{core}(X)) - \delta = d(y, \alpha(y)) - \delta,$$

so that $d'(\alpha(y'), y'') \le 2\delta$. In all cases, we have $(\alpha(y), \alpha(y')) \in C_{2\delta}$, as claimed. □
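
For convenience, the way the two displayed bounds in the preceding proof combine in the case $\alpha(y'') \ne \alpha(y')$ can be written out as follows. This is only a restatement in the notation of the proof, reading both bounds as non-strict inequalities, and not an additional claim.

```latex
\begin{align*}
d'(\alpha(y'), y'')
  &= d'(y', y'') - d'(y', \alpha(y')) \\
  &\le \bigl( d(y, \alpha(y)) + \delta \bigr) - \bigl( d(y, \alpha(y)) - \delta \bigr)
   = 2\delta .
\end{align*}
```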

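Lemma 7.3. Fix $r \in (0, 1)$. For all $\varepsilon > 0$ there exists $\eta > 0$ such that if $d^{1,1}_{\mathrm{GHP}}(\mathrm{X}, \mathrm{X}') < \eta$
then

$$d_{\mathrm{GHP}}(\mathrm{X}_x, \mathrm{X}'_{x'}) \le \mu(\tilde{B}_\varepsilon(x)) + \mu'(\tilde{B}_\varepsilon(x')) + 3 \max(h_\varepsilon(x), h_\varepsilon(x')) + 7 \left( \frac{2(\mathrm{diam}(X) \vee \mathrm{diam}(X'))}{r} \vee 1 \right) \varepsilon.$$

Proof. Since $d^{1,1}_{\mathrm{GHP}}(\mathrm{X}, \mathrm{X}') < \eta$ we can find $C_0 \in C(X, X')$ with $\mathrm{dis}(C_0) < \eta$ and with $(x, x') \in
C_0$, and a measure $\pi$ with $D(\pi; \mu, \mu') < \eta$ and $\pi(C_0^c) < \eta$. Fix $\delta > 0$ such that $\delta < \varepsilon/10$ and
$\delta < r/56$. By choosing $\eta < \delta$ sufficiently small, it follows by Proposition 6.5 that there exists
a $\delta$-overlay $C$ of $\mathrm{X}$ and $\mathrm{X}'$ with $C_0 \subseteq C$, so in particular $(x, x') \in C$ and $\pi(C^c) < \eta < \delta$. We
also remark that $D(\pi; \mu, \mu') < \delta$.

We next modify $C$ to give a correspondence between $X_{x,\varepsilon}$ and $X'_{x',\varepsilon}$ by letting

$$C^{(\varepsilon)} = \bigl( C \cap (X_{x,\varepsilon} \times X'_{x',\varepsilon}) \bigr) \cup A_1 \cup A_2 \cup A'_1 \cup A'_2,$$

where for $i \in \{1, 2\}$, we define

$$A_i = \{ (y, x'_{(i),\varepsilon}) : (y, y') \in C \cap (X_{x,\varepsilon} \times B'_{(i),\varepsilon}) \},$$
$$A'_i = \{ (x_{(i),\varepsilon}, y') : (y, y') \in C \cap (B_{(i),\varepsilon} \times X'_{x',\varepsilon}) \}.$$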

To verify that $C^{(\varepsilon)}$ is indeed a correspondence between $X_{x,\varepsilon}$ and $X'_{x',\varepsilon}$, it suffices to check
that there does not exist $y \in X_{x,\varepsilon}$ for which $(y, x') \in C$, and similarly that there does not
exist $y' \in X'_{x',\varepsilon}$ for which $(x, y') \in C$. In the first case this is immediate since $d(x, y) \ge \varepsilon$,
so for all $y' \in X'$ with $(y, y') \in C$ we have $d'(x', y') \ge \varepsilon - \delta > 0$. A symmetric argument
handles the second case.

We next estimate the distortion of $C^{(\varepsilon)}$ when $X_{x,\varepsilon}$ and $X'_{x',\varepsilon}$ are endowed with the metrics
$d_x$ and $d'_{x'}$ respectively. To this end, let $(y, y'), (z, z') \in C^{(\varepsilon)}$. We have to distinguish several
cases. The simplest case is when $(y, y')$ and $(z, z')$ are, in fact, both in $C$. In particular,
$y, z \in X_{x,\varepsilon}$ and $y', z' \in X'_{x',\varepsilon}$. Let $f$ be a geodesic from $y$ to $z$ in $X_{x,\varepsilon}$, i.e. a local geodesic
in $X_{x,\varepsilon}$ not passing through $x$ and with minimal length. Let $f'$ be the path from $y'$ to $z'$
associated with $f$ as in Lemma 6.7, which we may apply since $\delta < r/56$. We claim that $f'$
does not pass through $x'$. Indeed, if it did, then we would be able to find a point $\tilde{x} \in \mathrm{Im}(f)$
such that $(\tilde{x}, x') \in C_{8\delta}$. Since also $(x, x') \in C_{8\delta}$, it would follow that

$$d(x, \mathrm{Im}(f)) \le d(x, \tilde{x}) \le d'(x', x') + 8\delta < \varepsilon,$$

contradicting the fact that $f$ is a path in $X_{x,\varepsilon}$. By Lemma 6.7, we deduce that

$$d'_{x'}(y', z') \le d_x(y, z) \left( 1 + \frac{64\delta}{r \wedge d_x(y, z)} \right) = d_x(y, z) + 64 \left( \frac{d_x(y, z)}{r} \vee 1 \right) \delta \le d_x(y, z) + 64 \left( \frac{2\,\mathrm{diam}(X)}{r} \vee 1 \right) \delta, \tag{34}$$

where at the last step we use that $d_x(y, z) \le \mathrm{diam}(X_x) \le 2\,\mathrm{diam}(X)$.
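
The middle step of (34) is elementary algebra. As a sketch, abbreviate $d = d_x(y, z)$ and assume $d > 0$ (the abbreviation and the positivity assumption are introduced here only for this remark):

```latex
\begin{align*}
d \left( 1 + \frac{64\delta}{r \wedge d} \right)
  &= d + 64\delta \cdot \frac{d}{r \wedge d}, \\
\frac{d}{r \wedge d}
  &= \begin{cases} d/r & \text{if } d \ge r, \\ 1 & \text{if } d < r, \end{cases}
   \;=\; \frac{d}{r} \vee 1 .
\end{align*}
```

Since $t \mapsto (t/r) \vee 1$ is non-decreasing, replacing $d$ by any upper bound, here $2\,\mathrm{diam}(X)$, can only increase the right-hand side.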

Let us now consider the cases where $(y, y') \notin C$, still assuming that $(z, z') \in C$. There
are two possibilities.

1. There exists $y'' \in B'_{(i),\varepsilon}$ with $(y, y'') \in C$ and $i \in \{1, 2\}$, and so $y' = x'_{(i),\varepsilon}$.

2. There exists $\tilde{y} \in B_{(i),\varepsilon}$ with $(\tilde{y}, y') \in C$ and $i \in \{1, 2\}$, and so $y = x_{(i),\varepsilon}$.

Let us consider the first case, assuming $i = 1$ for definiteness. The argument leading to (34)
is still valid, with $y''$ replacing $y'$. Using $d'_{x'}(y'', x'_{(1),\varepsilon}) = d'(y'', x'_{(1),\varepsilon}) \le h_\varepsilon(x')$, we obtain

$$d'_{x'}(y', z') \le d_x(y, z) + 64 \left( \frac{2\,\mathrm{diam}(X)}{r} \vee 1 \right) \delta + h_\varepsilon(x').$$

In the second case (still assuming $i = 1$ without loss of generality), we have to modify
the argument as follows. We consider a geodesic $f$ from $\tilde{y}$ to $z$ in $(X_x, d_x)$. We let $f'$ be
the associated path from $y'$ to $z'$ (again using Lemma 6.7), and claim that $x' \notin \mathrm{Im}(f')$.
Otherwise, $f$ would visit a point at distance less than $8\delta$ from $x$. On the other hand, the
point of $\mathrm{Im}(f)$ that is closest to $x$ is $\alpha(\tilde{y})$. But by Lemma 7.2, we have

$$d(x, \alpha(\tilde{y})) \ge d'(x', \alpha(y')) - 2\delta \ge \varepsilon - 2\delta > 8\delta.$$

Finally, since $y = x_{(1),\varepsilon}$ we obtain that $d_x(\tilde{y}, z) \le d_x(y, z) + h_\varepsilon(x)$, and the argument leading
to (34) yields

$$d'_{x'}(y', z') \le d_x(y, z) + h_\varepsilon(x) + 64 \left( \frac{2\,\mathrm{diam}(X)}{r} \vee 1 \right) \delta.$$

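Arguing similarly when $(z, z')$ is no longer assumed to belong to $C$, we obtain the following
bound for every $(y, y'), (z, z') \in C^{(\varepsilon)}$:

$$d'_{x'}(y', z') \le d_x(y, z) + 2 \bigl( h_\varepsilon(x) \vee h_\varepsilon(x') \bigr) + 64 \left( \frac{2\,\mathrm{diam}(X)}{r} \vee 1 \right) \delta.$$

Writing $h_\varepsilon = h_\varepsilon(x) \vee h_\varepsilon(x')$, by symmetry we thus conclude that

$$\mathrm{dis}(C^{(\varepsilon)}) \le 2 h_\varepsilon + 64 \left( \frac{2(\mathrm{diam}(X) \vee \mathrm{diam}(X'))}{r} \vee 1 \right) \delta,$$

where the distortion is measured with respect to the metrics $d_x$ and $d'_{x'}$. Now let $C^{(\varepsilon)}_{h_\varepsilon}$ be the
$h_\varepsilon$-enlargement of $C^{(\varepsilon)}$ with respect to $d_x$ and $d'_{x'}$. Since $C^{(\varepsilon)}$ is a correspondence between
$X_{x,\varepsilon}$ and $X'_{x',\varepsilon}$, and all points of $X_x$ (resp. $X'_{x'}$) have distance at most $h_\varepsilon$ from $X_{x,\varepsilon}$ (resp.
$X'_{x',\varepsilon}$) under $d_x$ (resp. $d'_{x'}$), we have that $C^{(\varepsilon)}_{h_\varepsilon}$ is a correspondence between $X_x$ and $X'_{x'}$, of
distortion at most

$$3 h_\varepsilon + 64 \left( \frac{2(\mathrm{diam}(X) \vee \mathrm{diam}(X'))}{r} \vee 1 \right) \delta.$$

Finally, since $D(\pi; \mu, \mu') < \delta$ and $\pi(C^c) < \delta$, and since $\bigl( C \cap (X_{x,\varepsilon} \times X'_{x',\varepsilon}) \bigr) \subseteq C^{(\varepsilon)}_{h_\varepsilon}$,
we have

$$\pi\bigl( (C^{(\varepsilon)}_{h_\varepsilon})^c \bigr) \le \pi(C^c) + \pi(\tilde{B}_\varepsilon(x) \times X') + \pi(X \times \tilde{B}_\varepsilon(x')) \le \delta + \mu(\tilde{B}_\varepsilon(x)) + \mu'(\tilde{B}_\varepsilon(x')).$$

Since $65\delta < 6.5\varepsilon < 7\varepsilon$, the lemma then follows from the two preceding offset equations and
the definition of the distance $d_{\mathrm{GHP}}$. □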

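Proof of Theorem 7.1. Fix $\varepsilon > 0$. Under the hypotheses of the theorem, for all $n$ large
enough, by Lemma 7.3 we have

$$d_{\mathrm{GHP}}(\mathrm{X}_x, \mathrm{X}^n_{x^n}) \le \mu(\tilde{B}_\varepsilon(x)) + \mu^n(\tilde{B}_\varepsilon(x^n)) + 3 \max(h_\varepsilon(x), h_\varepsilon(x^n)) + 7 \left( \frac{2(\mathrm{diam}(X) \vee \mathrm{diam}(X^n))}{r} \vee 1 \right) \varepsilon.$$

It is easily checked that, for all $\varepsilon > 0$,

$$\limsup_{n \to \infty} h_\varepsilon(x^n) \le h_{2\varepsilon}(x), \qquad \limsup_{n \to \infty} \mu^n(\tilde{B}_\varepsilon(x^n)) \le \mu(\tilde{B}_{2\varepsilon}(x)),$$

which both converge to 0 as $\varepsilon \to 0$. The result follows. □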



7.3 Randomly cutting $\mathbb{R}$-graphs

Let $\mathrm{X} = (X, d, x)$ be a safely pointed $\mathbb{R}$-graph, and write $L$ for the length measure restricted
to $\mathrm{conn}(\mathrm{X})$. Then $\mathrm{X}_x = (X_x, d_x)$ is an $\mathbb{R}$-graph with $s(\mathrm{X}_x) = s(\mathrm{X}) - 1$. Indeed, if $e$ is the edge
of $\ker(\mathrm{X})$ that contains $x$, then it is easy to see that $\ker(\mathrm{X}_x)$ is the graph obtained from $\mathrm{X}$ by
first deleting the interior of the edge $e$, and then taking the kernel of the resulting $\mathbb{R}$-graph.
Taking the kernel of a graph does not modify its surplus, and so the surplus diminishes by 1
during this operation, which corresponds to the deletion of the edge $e$. Moreover, we see that
$\mathcal{A}_r$ is stable under this operation, in the sense that if $(X, d) \in \mathcal{A}_r$, then for every $x$ such that
$(X, d, x)$ is safely pointed, the space $(X_x, d_x)$ is again in $\mathcal{A}_r$. Indeed, the edges in $\ker(\mathrm{X}_x)$
are either edges of $\ker(\mathrm{X})$, or a concatenation of edges in $\ker(\mathrm{X})$, and so the minimum
edge-length can only increase. On the other hand, the total core length and surplus can only
decrease.

Let us now consider the following random cutting procedure for $\mathbb{R}$-graphs. If $(X, d)$ is an
$\mathbb{R}$-graph which is not an $\mathbb{R}$-tree, then it contains at least one cycle, and by Proposition 2.6,
$\ell$-almost every point of any such cycle is in $\mathrm{conn}(X)$. Consequently, the measure $L =
\ell(\cdot \cap \mathrm{conn}(X))$ is non-zero, and we can consider a point $x$ chosen at random in $\mathrm{conn}(X)$ with
distribution $L / L(\mathrm{conn}(X))$. Then $(X, d, x)$ is a.s. safely pointed by Proposition 2.6. Let
$\mathcal{K}(\mathrm{X}, \cdot)$ be the distribution of $(X_x, d_x)$. By convention, if $\mathrm{X}$ is an $\mathbb{R}$-tree, we let $\mathcal{K}(\mathrm{X}, \cdot) = \delta_{\{\mathrm{X}\}}$.
By combining Corollary 6.6 (ii) with Theorem 7.1, we immediately obtain the following
statement.

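Proposition 7.4. Fix $r > 0$, and let $(\mathrm{X}^n, n \ge 1)$ and $\mathrm{X}$ be elements of $\mathcal{A}_r$ such that
$d_{\mathrm{GHP}}(\mathrm{X}^n, \mathrm{X}) \to 0$ as $n \to \infty$. Then $\mathcal{K}(\mathrm{X}^n, \cdot) \xrightarrow{d} \mathcal{K}(\mathrm{X}, \cdot)$ in $(\mathcal{M}, d_{\mathrm{GHP}})$, as $n \to \infty$.

In particular, $\mathcal{K}$ defines a Markov kernel from $\mathcal{A}_r$ to itself for every $r$. Since each
application of this kernel decreases the surplus by 1 until it reaches 0, it makes sense to define
$\mathcal{K}^\infty(\mathrm{X}, \cdot)$ to be the law of $\mathcal{K}^m(\mathrm{X}, \cdot)$ for every $m \ge s(\mathrm{X})$, where $\mathcal{K}^m$ denotes the $m$-fold
composition of $\mathcal{K}$. The next corollary follows immediately from Proposition 7.4 by induction.

Corollary 7.5. Fix $r > 0$, and let $(\mathrm{X}^n, n \ge 1)$ and $\mathrm{X}$ be elements of $\mathcal{A}_r$ with $d_{\mathrm{GHP}}(\mathrm{X}^n, \mathrm{X}) \to 0$
as $n \to \infty$. Then $\mathcal{K}^\infty(\mathrm{X}^n, \cdot) \xrightarrow{d} \mathcal{K}^\infty(\mathrm{X}, \cdot)$ in $(\mathcal{M}, d_{\mathrm{GHP}})$, as $n \to \infty$.

This proves Theorem 3.3.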
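
To make the iteration of the kernel concrete, here is a minimal sketch of a finite-graph analogue of the cutting procedure of this section. It is an illustration under stated assumptions, not the construction used in the paper: the graph is a connected multigraph given as a list of `(u, v, length)` triples, an edge lying on a cycle is chosen with probability proportional to its length (mimicking the choice of a point according to $L / L(\mathrm{conn}(X))$) and deleted whole, and the step is repeated until the surplus is 0. All function names are hypothetical.

```python
import random


def is_connected(vertices, edges):
    """Union-find connectivity test for an undirected multigraph."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v, _ in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices}) == 1


def surplus(vertices, edges):
    """Surplus of a connected multigraph: |E| - |V| + 1."""
    return len(edges) - len(vertices) + 1


def cycle_edge_indices(vertices, edges):
    """Indices of edges whose removal keeps the graph connected."""
    return [i for i in range(len(edges))
            if is_connected(vertices, edges[:i] + edges[i + 1:])]


def break_one_cycle(vertices, edges, rng):
    """One step: delete a length-biased random edge lying on a cycle.
    A tree is returned unchanged, mirroring the convention for trees."""
    idx = cycle_edge_indices(vertices, edges)
    if not idx:
        return list(edges)
    lengths = [edges[i][2] for i in idx]
    i = rng.choices(idx, weights=lengths, k=1)[0]
    return edges[:i] + edges[i + 1:]


def break_all_cycles(vertices, edges, rng):
    """Iterate the step until the surplus reaches 0 (a spanning tree)."""
    if not is_connected(vertices, edges):
        raise ValueError("the input graph must be connected")
    edges = list(edges)
    while surplus(vertices, edges) > 0:
        edges = break_one_cycle(vertices, edges, rng)
    return edges


# A 4-cycle with one chord: surplus 2, so two steps are needed.
vertices = [0, 1, 2, 3]
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0), (0, 2, 1.5)]
tree = break_all_cycles(vertices, edges, random.Random(0))
```

Each call to `break_one_cycle` lowers the surplus by exactly 1 on a connected graph that is not a tree, so `break_all_cycles` performs exactly `surplus(vertices, edges)` deletions, in line with the definition of $\mathcal{K}^\infty$ as $\mathcal{K}^m$ for any $m \ge s(\mathrm{X})$.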



List of notation and terminology 

$\asymp$   $Y(\lambda) \asymp f(\lambda)$ if for all $a > 1$, $\mathbb{P}(Y(\lambda) \notin [f(\lambda)/a, a f(\lambda)]) = \mathrm{oe}(\lambda)$   35

$\alpha(x)$   Point of attachment of $x$ to $\mathrm{core}(X)$   14

$\mathcal{A}_r$   Set of "$r$-uniformly elliptic $\mathbb{R}$-graphs"   17

$\mathcal{A}_r^{\bullet}$   Set of safely pointed elements of $\mathcal{A}_r$   43

arc   The image of a path; see same section for simple arc   13

branchpoint   Point of degree at least three in an $\mathbb{R}$-graph $X$   14

$C([a, b], X)$   Set of paths from $a$ to $b$   13

$\mathrm{conn}(X)$   Set of points $x$ of $\mathrm{core}(X)$ such that $X \setminus \{x\}$ is connected   15

$\mathrm{core}(X)$   See Definition 2.3   14

$C_r$   The $r$-enlargement of correspondence $C$   44

$\mathrm{cut}(X)$   Random $\mathbb{R}$-graph with distribution $\mathcal{K}^\infty(X, \cdot)$   19

$C(X, X')$   Set of correspondences between $X$ and $X'$   10

cycle   Embedded cycle: image of a continuous injective $f : \mathbb{S}_1 \to X$; see same section
for acyclic and unicyclic metric spaces   13

$\deg_X(x)$   Degree of $x$ in $\mathbb{R}$-graph or $\mathbb{R}$-tree $X$   13

$d_{\mathrm{GH}}(X, X')$   Gromov-Hausdorff distance between $X$ and $X'$; equal to $\frac{1}{2} \inf\{\mathrm{dis}(C) : C \in C(X, X')\}$.
See same section for $d^{k}_{\mathrm{GH}}(X, X')$   10

$d_{\mathrm{GHP}}(X, X')$   Gromov-Hausdorff-Prokhorov distance between $X$ and $X'$; see the same section
for $d^{k,l}_{\mathrm{GHP}}(X, X')$   11

$d_{\mathrm{H}}$   Hausdorff distance   26

$\mathrm{diam}((X, d))$   Equal to $\sup_{x, y \in X} d(x, y)$   10

$\dim_{\mathrm{M}}(X)$   Minkowski dimension of $X$; see same section for $\underline{\dim}_{\mathrm{M}}(X)$ and $\overline{\dim}_{\mathrm{M}}(X)$   34

$\mathrm{dis}(C)$   Distortion of the correspondence $C$; equal to $\sup\{ |d(x, y) - d'(x', y')| : (x, x') \in C, (y, y') \in C \}$   10

$d_l$   The intrinsic distance associated with $(X, d)$ or with a subset $Y \subseteq X$   13

$D(\pi; \mu, \mu')$   Discrepancy of $\pi \in M(X, X')$   11

$E(G)$   Set of edges of the graph $G$   2

$\varepsilon$-overlay   See definition in text   43

$e(X)$   The edges of the kernel $\ker(X)$   42

$F^n_\lambda$   Subgraph of $M^n$ with edges $E(M^n) \setminus E(M^{n,1}_\lambda)$; its components are $\{F^n_\lambda(v), v \in V(M^{n,1}_\lambda)\}$   28

geodesic   Definition in text; see same section for geodesic arc, geodesic space   13

$\mathrm{gir}(X)$   The girth of $X$ is $\inf\{\mathrm{len}(c) : c \text{ is an embedded cycle in } X\}$   39

G",G",^\ Sequence of components (G^'*,z > 1) of G(n, 1/n + A/n 4 / 3 ); see same section 

for measured metric space version G^ and limit £f\ 20 

$\ker(X)$   Kernel $(k(X), e(X))$ of $\mathbb{R}$-graph $X$; see also Section 6.3   15

$K_n$   Complete graph on $\{1, \dots, n\}$   2

$K(\phi)$   Bi-Lipschitz constant of $\phi$   44

$k(X)$   Set of branchpoints of $\mathbb{R}$-graph $X$   14

$\mathcal{K}(X, \cdot)$   Cycle-breaking Markov kernel on $\mathbb{R}$-graph $X$. See same section and also
Section 7.3 for $\mathcal{K}$, $\mathcal{K}^n$ and $\mathcal{K}^\infty$   17

$K(G, \cdot)$   Cycle-breaking Markov kernel on finite multigraph $G$   16

$\ell$   Length measure on an $\mathbb{R}$-graph $X$   14

$L$   Length measure restricted to $\mathrm{conn}(X)$   17

$\Lambda^n$   Last time a new component takes the lead; see Proposition 4.11 for subsequential limit $\Lambda$   26

$\mathrm{len}(f)$   The length of path $f$   13

local geodesic   Definition in text   13

$(\mathbb{L}_p, \mathrm{dist}^p_{\mathrm{GHP}})$   Set of sequences of measured metric spaces, with the distance $\mathrm{dist}^p_{\mathrm{GHP}}$   12

$\mathcal{L}(X)$   Set of leaves of $X$   13

$\mathrm{mass}(\mathrm{X})$   For a measured metric space $\mathrm{X} = (X, d, \mu)$, equal to $\mu(X)$   11

$(\mathring{\mathcal{M}}, d_{\mathrm{GH}})$   Set of isometry classes of compact metric spaces with GH distance; see same
section for $(\mathring{\mathcal{M}}^{(k)}, d^{k}_{\mathrm{GH}})$   10

$(\mathcal{M}, d_{\mathrm{GHP}})$   Set of measured isometry-equivalence classes of compact measured metric spaces,
with GHP distance; see same section for $(\mathcal{M}^{k,l}, d^{k,l}_{\mathrm{GHP}})$   11

$\mathcal{M}^*$   Set of pairs $(X, Y)$ where $X$ is a compact metric space and $Y \subseteq X$ is compact;
see same section for the marked Gromov-Hausdorff topology   33

jK\ JC\ renormalised to have mass one 32 

M,' M^' renormalised to have mass one 28 

M.™,M™,JC\ Sequence of components (M^'*,z > 1) of M(n, 1/n + A/n 4 / 3 ); see same section 

for measured metric space version and limit ^\ 24 

M n Minimum spanning tree of K n (as a graph) 3 

M n , ^# M n is the measured metric space version of MP; is its GHP limit 5 

M n , M"' 1 , JC, JC{ Spaces obtained from M n , M^\jt , JC{ by ignoring measures 27 

$M(X, X')$   Set of finite non-negative Borel measures on $X \times X'$   11

$N(X, r)$   Minimal number of open balls of radius $r$ needed to cover $X$   34

$\mathrm{oe}(x)$   Say $f(x) = \mathrm{oe}(x)$ if $|f(x)| \le c \exp(-c' x^{\varepsilon})$ for some $c, c', \varepsilon$ and $x$ large   26

$\phi_* \mu$   Push-forward of measure $\mu$ under a map $\phi$   11

$\mathbb{R}$-graph   See Definition 2.2   14

$\mathbb{R}$-tree   Acyclic geodesic metric space   13

$r(X)$   Minimal length of a core edge in $X$   21

$R(X)$   Largest $\varepsilon$ such that $B_\varepsilon(x)$ is an $\mathbb{R}$-tree for all $x \in X$   40

safely pointed   See Definition 3.2   17

$s(G)$, $s(X)$   Surplus of $G$ and of $X$   15

$\mathrm{skel}(X)$   Points of degree at least two in an $\mathbb{R}$-graph $X$   14

$S^n_\lambda(v)$   Size of $F^n_\lambda(v)$   28

$\mathscr{T}$   Brownian CRT   5

$V(G)$   Set of vertices of the graph $G$   3

$W^\lambda$   Brownian motion with parabolic drift   20

$\mathrm{X}$   A metric space, possibly decorated with measures and/or points   10

$[X, d]$   Isometry class of the metric space $(X, d)$   10

$(X, d, \mu)$, $[X, d, \mu]$   $(X, d, \mu)$ is a measured metric space; $\mu$ is a finite measure on $X$. $[X, d, \mu]$ is its
measured isometry-equivalence class   11

$(X_x, d_x)$   $\mathrm{X} = (X, d)$ or $\mathrm{X} = (X, d, \mu)$ cut at the point $x \in X$; see also Section 7.1   17